Transcript for:
Estimation and Psychometrics in Software Development

All right, all right. Thank you for sticking with us through a little delay, but I'm not going to delay any longer. Joseph is a theoretically retired psychologist, but he's also a pioneer in agile methods and a master of applying psychology to process automation, with nearly 30 years of experience helping some of the world's most well-known companies improve their ability to satisfy the needs of their customers. And even though he's officially retired, he's still pushing the boundaries of agile. Welcome to the stage. A round of applause please, everybody. Come on, give me some love for Joseph!

Yes, theoretically retired. My wife and I are both retired, but we said the Rolling Stones have been doing their farewell tour for 20 years now, so we don't need to stop.

So how many of you hate it when your manager comes and says, "I need an exact estimate of how long you're going to take to do this"? Hands up. Right. God, yeah.

So if there's one thing that software developers do well, it's develop software. And some of you are really good at it. Okay, some of you think you're better than you actually are, at least that's what Professors Dunning and Kruger think. But some of you really are experts, and the problem with being an expert is that the more you become an expert, the more conservative you become, the more myopic and shortsighted you become, the more you're captured in the bubble of your expertise and unable to think outside of it. Right?

You might be surprised to know that other people also estimate. They might not call it estimation, but they also estimate. Economists estimate potential rates of inflation. Financial advisors try to estimate changes in the stock market for their investment clients. Meteorologists estimate the weather. Political commentators estimate who might win an election. Doctors estimate the chances of recovery, or the chances of death, of a patient. Every young man who's scared, thinking about whether to ask that young lady or other young man out on a date, estimates their chances of
success. Right? And last but not least, there are all those business and product managers themselves who are trying to estimate ROI. And how often do they get it right? Okay.

So the problem is, when you're a software developer in this bubble with other software developers, this bubble becomes an echo chamber, echo chamber, echo chamber, echo chamber, echo chamber. And what eventually happens there is that it becomes a cult. I really wanted to get a T-shirt with this, but I didn't get around to it. Do you recognize all of this, especially agile developers? The cult of bad estimation, right?

But you might also be surprised to know that there is a science of how people estimate, and this science is called psychometrics. It's an advanced branch of psychology, and what I would like to do is spend some time talking with you about it. So I'd like to invite you to suspend your disbelief for half an hour or so and let me take you on a journey of looking at estimation from a different point of view.

So there are three things we're going to talk about. First, what the problem really is, and I'll give you one approach to solving it. Then I'm going to show you a couple of estimation fallacies and how to deal with them, and then a couple of new estimation techniques and how to use them.

So, the problem. Any of you who know Star Trek know the Kobayashi Maru. If you don't know it: the civilian ship, the Kobayashi Maru, is stranded in the Neutral Zone between the Federation and the Klingons. This is a test that was given at Starfleet Academy to potential commanders. You're commanding a starship, which is a warship, and the civilian ship is stuck in the middle. You have a choice: either you go in with your warship to save them, and the Klingons will see that as a declaration of war and attack you, or you just leave it, and the crew of the Kobayashi Maru die. What do you do? Right. Estimation is a Kobayashi Maru situation. It's a no-win: damned if you do, damned if you don't. And I tried to avoid bringing long quotes, but this one from my friend
Raquel at Google is a great one: story points are like your failed hails to the Klingon ships, asking them, please, can you come in and save the ship. You know it's useless, they'll ignore it, but at least when Starfleet finds the log books next to your rotting corpse, you can rest in the satisfaction of saying that you tried according to the book and it's all the Klingons' fault. Right?

So this is a problem. How do you deal with this no-win situation? Well, there's an obvious solution, and some of you might even be old enough to recognize this: it's a strange game; the only winning move is not to play. No Estimates is a perfectly absurd solution to a perfectly absurd demand for an accurate estimate, right? And in that context I think it's great. But back to the thing: estimating in time isn't the problem. The political consequences of estimating in time are the problem.

So what is this psychometrics stuff? Psychometrics is the science of estimating. Essentially, this is the way we measure what we call a latent construct in psychology. A latent construct is something you can't directly measure. I could measure your height. I could measure your weight, though I wouldn't want to embarrass you. So I can measure both of those. What I can't directly measure is your intelligence. So what I do as a psychologist is develop tests that can indirectly measure it. What psychometrics teaches us is how to develop those tests, how to ask better questions, and how to run better statistics, by remembering that people are really weird when they answer questions. Okay.

So what do I mean by really weird? Let's take a look at a couple of estimation fallacies. An estimate says as much about the person estimating as it does about the work to be estimated. Right?

First, strategic misrepresentation. You might not believe this, but there are people who will underestimate or overestimate on purpose. Right? People will underestimate or overestimate the value of things at different times, on purpose. That was
supposed to be a joke. Take a look at it. Right. So be aware of that fallacy. You should ask yourself whether anyone in your team would have a reason for overestimating or underestimating the amount of time you need or the amount of work that needs to be done on something.

Next, the relativity fallacy. Or, as I paraphrase Albert Einstein: how long "just a minute" is depends on which side of that closed toilet door you're standing. Okay? So remember, "just a minute" is commonly agreed to be 60 seconds, but it just depends. So be aware of this fallacy. Make sure that the estimates come from the people who will actually be doing the work, and not from the manager who says, "Hey, come and do this, it will only take you a minute."

Wait, this is the first time I've had people coming on stage while I'm talking. Okay.

Next one, the planning fallacy. The planning fallacy is one of the things that brought Daniel Kahneman... oh, sorry. Did you just change something on me? No, that's not what's up here. Let me go back another one. Okay, now let's try again. Let's synchronize. Okay, good.

The planning fallacy. The planning fallacy is what brought Daniel Kahneman the Nobel Prize in economics. Amos Tversky unfortunately passed away before then. The planning fallacy says that people tend to underestimate when they go into too much detail about something rather than comparing it to similar things that they have already done. Right? And this is a very powerful fallacy: underestimation. So the important thing here is to collect data on your estimates and the actual times that you needed, and use that information to build reference classes for estimates. We'll get to reference classes later, and I'll tell you how to use those.

So let's get to one that's really scary: the completeness fallacy. This is an excellent book to read. It's a bit gory, but it's an excellent book to read. Gawande writes about what it takes for a young doctor to become qualified as a surgeon. Now, the biggest problem of a young doctor becoming qualified as a surgeon is to
find enough people to cut up. Interns are constantly looking for patients that they can practice on. Right? And a doctor will come in to the patient and say, "Oh, you need this operation," and the patient says, "Well, what are the chances of success?" and the doctor may say, "Okay, 80, 85%." But as Gawande says: do we ever tell the patients that, because we're still just practicing this stuff, their risks are going to be higher, and they'd probably be better off with one of the head doctors who's more experienced? Do we ever say that we need them to do it anyway so we can get our practice? I've never seen it. Given the stakes, who in their right mind would agree to be practiced on? Right. Do you have all the information you need?

So let's get into psychometrics. This is the basic equation from item response theory. The estimated value of X, let's say X is something that needs to be done, equals the true value, which is something we cannot directly measure except by doing it, which is why we're estimating it, plus a systemic error, which is an error in the way we're measuring it, plus a random measurement error that creeps in. Okay.

So now, this true value here: the dimensionality of estimates. The time needed for this is a function of three variables. It's a function of the work that needs to be done, the person or people doing the work, and the instruments and tools available to that person, plus something we'll call a load factor. Now, when we think about that: the risk, or the chances of success, of an operation will depend not only on what kind of operation it is. Is it being done by the head doctor or an intern? Is it being done in the surgical ward of a major hospital in London, or someplace out in the Gaza Strip? Be honest about it. All these things play a role, and all of these need to be considered as factors in your estimate.

A load factor: you probably know that a load factor is any impediment that will cause something to take longer. And we all know these: meetings, facilities,
lack of good tools, right? Interference, disruptions, team problems, personnel changes, unclarity in the requirements. Now, a load factor is what in psychology we call a confounding variable. Confounding means it can affect the work to be done, or the person doing it, or the tools available for doing it, which means it can screw up essentially everything. So the important thing is to be aware and understand: have you considered all the factors in your estimate, as well as your potential load factors? Okay.

Another hardcore one: the distribution fallacy. You're probably familiar with roulette. In roulette, when you bet on a number, your chances of winning are what, 2.6% or something like that. It's a uniform distribution. There's a uniform chance of getting any one of the other 36 numbers or the two jokers. A totally flat distribution. Okay.

An estimate, though, is a range and not a number. And ever since Barry Boehm wrote about estimating back in, what was it, the 70s or 80s, something like that, people have had this idea of the cone of uncertainty, which means that in the beginning, when you don't know anything, your range might actually be very big, and by the time you get done, you actually know how long it took. This has led people to think that when you give an estimate, the distribution of potential actuals around that estimate will be a Gaussian distribution. Right? This has been proven not to be so.

This is probably the most easily accessible work on this. It's a statistical model from Bernardon, based on Derek Jones's SI collection of over 15,000 estimates versus actuals. It's fully documented, the data set's out there, and all the R code to analyze it is open source. You can go and check this yourself. And what Bernardon discovered is that estimates versus actuals is not Gaussian; it's a log-normal distribution. Think about it: most of the time you won't get it done quicker, but there are a lot of reasons why it's going to take longer. Right? You know what
this screws up? This screws up all those people who say that Monte Carlo simulations are the right thing to do. At least, all the Excel sheets I've checked that do this have a Gaussian and not a log-normal as their basic distribution. They're skewing all their estimates. Be careful.

Now, going on about this: Sprint capacity is also like this. It's an inverse log-normal distribution. There's a point where you can't get any more done, and this has something to do with the laws of the universe, 24-hour days, and things like that. But there are all these load factors and reasons that you're not going to be able to do everything. Right? So how much are you going to commit to tracking your estimates versus your actuals? Being aware of this fallacy means asking: are you aware of the underlying probability distribution of any estimate you're giving, especially considering the actuals?

Next fallacy: the groundhog fallacy. Thumbs up. Every day the same thing, every Sprint the same thing. I'm not going to get into a discussion about whether using velocity is a good idea or not. Let's leave that for a beer afterwards. But if you are using velocity, the important thing is: are you using velocity as a descriptive or an inferential statistic? Are you using it to describe something, or to predict how much you'll be able to do in the future? Those are two different types of statistics.

All right. This book by Hyndman and Athanasopoulos is an excellent book on forecasting. I think one of the most interesting things here is the third point: whether the forecast can affect what is being forecast. One of the greatest examples of that is Brexit, where the political commentators forecast, "Oh, this is going to be stay, nobody's going to vote for this." So all the people who were going to vote stay, well, they stayed at home and didn't vote, and the only people who went to vote were the ones who voted to leave. Right? So how, in your organization, will any information about this forecast affect what's going to be forecast? Think about it, right,
especially considering that the projected cost is often an objective criterion for prioritizing any development effort, whether to do it or not to do it. Right?

Okay. If you're doing velocity, you're doing it wrong. Oh, okay, I said great. Because if you're doing velocity, the standard way to do it is taking the arithmetic mean of the last n Sprints, where n is a number between, what do we normally do, three and five or something like that. Okay, this is what you get. It's the average value, and you say, this is about how much we're going to do. Nice, friend. There is a basic assumption here that is wrong. That basic assumption is that all your data points are equally important, that they're all equally weighted. If you're using the last five two-week Sprints, that's saying that the way your team worked almost three months ago is exactly the way they're working right now. Is that true? It may be, but it's something that you really do need to question.

There is a better technique. For those of you who know statistics and time series analysis, it's something called an exponentially weighted moving average, where what you're doing is putting less and less weight on the earlier data points. I think there's even something in the statistics add-on for Excel that lets you calculate this automatically, if you want to play with this. The standard alpha for estimating, in my experience, is about 0.28.

But one of the interesting experiments I did was to take a random sample of 10,000 data points representing the velocity of a team, and then I ran four different experiments over it, saying: with this prior, I would try to predict what the next data point would be, and see, with this distribution, how close it was. So these are the results. First one: the last-observation-carried-forward technique, also known as yesterday's weather, had the worst result. The arithmetic mean was a lot better. The Holt-Winters algorithm, which is a time series analysis method that calculates the alpha for an exponentially
weighted moving average, performed best. And the median, which is also funny, because the median tends to be the preferred method to use when you have a log-normal distribution, worked worse in this simulation than the mean did. One slide I unfortunately took out for lack of time is where I mapped the residuals of the Holt-Winters. The Holt-Winters residuals themselves follow a log-normal distribution, which is quite interesting statistically. So be aware of the fallacy: don't assume that your team is working the same as they were months ago. And if they are, is there something else that you can do to help them?

Then the last one, and then we're going to go on to other methods: the objectivity fallacy. Do you recognize any of these? "I know what I'm doing." "We started with two story points and three max." "This is exactly what I..." "I think you guys are right." "Based on that spec here." Recognize those? Cognitive biases. Now, I used to give out a sheet with questions about cognitive biases that you could run through in your estimation session to review this. In the meantime I've actually gone over to doing a buzzword bingo version of it, which is actually a lot of fun, to be aware of the biases that are coming in when you're estimating. So if you want to get better at this, the way of avoiding this bias is to download the sheet. That's a QR code; grab that, or ping me on LinkedIn or wherever and I'll send it to you. Okay? It's quite useful.

Okay, let's go on to some estimation techniques. I'm also going to cut this down a little bit. This is actually the best part of a three-hour workshop, which means I'm leaving all the juicy bits out.

"All estimates are wrong; some are useful." George Box would probably love me for that.

One of the first techniques I use: I estimate in time, but not only in time. Excuse me, I am not a mathematician, nor do I play one on television. Let's play with the words. Let's say that estimating is a complex activity, and if that is so, estimates are complex numbers. Complex
numbers have two parts, right? They have a real part and they have an imaginary part. And so what we can do is refactor those two parts and deal with them separately. So your real part is the number. The imaginary part is anxiety. Anxiety, angst, fear: that's in your subconscious, and it's often based on things that happened to you in the past. All right, remember when you were a little child and you told your mother, "I'm going to go down the street to play with my mate," and she said, "Yeah, be back at 5:00 for dinner," and you said, "Yeah, I got this." And you were having so much fun playing that it was 5:00, then 5:30, then 6:00, and about quarter past 6 your father came and grabbed you by the ear and dragged you home. This is a traumatic memory that left its traces in you, and you say, "Man, if I don't estimate right, if I don't get this right, I'm going to be punished. So I'll probably estimate a bit too much on this." Right?

So the first thing you need to do here is rationalize this. Bring it up to the surface and say, yes, it's here. When you rationalize this, what you get is risk. Risk is rationalized anxiety. It's saying: yeah, there is a risk that I might be wrong. Yeah, it happens. Let's accept that fact and deal with that fact. And the next step is to quantify it. I just use a Likert scale from one to five, five being essentially, "Oh, I would bet my job that I need exactly so much time to do this." A one would be, "Well, Joseph just asked me in front of all these people, so I'll say something, but I don't believe it myself." Okay? And what you get when you do that is value-neutral information.

And then the next thing I do is teach the product owners or product managers to ask the question, not "What can I do to get the estimate better?" but "How can I help you get your comfort level up?" And the answer to that normally is not "I need more time." The answer is "I need more information." And if you track your estimates and actuals and your comfort level, track your data,
what you're going to find is that your comfort level is going to become your standard deviation. If you have a comfort level of five, that means you're pretty spot on with your estimates. If you have a comfort level of one... so.

Next technique, and this is pretty much state of the art: reference class forecasting. The first job I had when I started my own company in 2001 was to write an expert opinion for the German court on the planning software for the German train system, because at that time 96% of German high-speed trains were over half an hour late. I got that gig because I was the one who wrote the planning software for Swissair back in the mid-90s. Okay, and they got that. And for those of you who know German, you know what's written there in white, right? And it's also interesting that in Switzerland we're really bad in comparison to Japan and Singapore, but we're actually not bad in Europe.

An interesting fact is that at Deutsche Bahn I had a whole group of people with PhDs in physics who could calculate, depending on the time of year, the weather, the humidity, how many people were on the train, all the things, what the gradient was from one train station to another, the amount of time needed to get from one train station to the next, plus or minus 2 milliseconds. Right? 96% of the trains were over half an hour late, but they could get down to two milliseconds. This is the planning fallacy.

We flew from Zurich to Budapest yesterday. The pilot said the flight would take an hour and 15 minutes. You know what? The flight took an hour and 17 minutes. How did they do that? Did the pilot actually go out and measure everything? No. What airlines do is reference class forecasting. Reference class forecasting is the answer to Kahneman's planning fallacy. This is the standard paper on it, from Flyvbjerg. Essentially, what you do is identify a relevant reference class of past similar projects. It takes a bit of practice to get a class that's broad enough that you have a decent
sample size but narrow enough that it's going to be specific. Right? And you calculate the probability distribution for that, and it's probably going to be log-normal. And then any time you're estimating something, you say, okay, this is like something else that I've done in the past. You compare your specific project or task or whatever to the reference class distribution, you can regress it to the mean, and you use the range as its standard deviation. That's essentially what you do. It works very well.

Another technique is something called unpacking. Unpacking was written up in this paper by Justin Kruger and Matt Evans. Justin Kruger is the same one from Dunning-Kruger. Essentially, you know this as task breakdown. Kruger says that if you don't go into too much detail but think through the individual steps, you'll have better estimates. I explain it this way: you won't have the "oh" effect. The "oh" effect, you've probably experienced this: you do your planning, you get into the Sprint, and after two days, three days max, somebody says, "Oh, we forgot..." Right? Recognize that? So that's all that's about. Break it down so you're aware of all the steps that need to be done, your dependencies, and things like that. It's pure logic.

One other thing to remember: estimates have a shelf life of maximum one Sprint. Task breakdowns do too. Because your estimates are based on the current value of two volatile variables. The first is the state of the system. Where you are will influence how long it takes you to get from point A to point B. But as you're working on your system, you're invalidating that by changing the system. So you might want to think twice: has this invalidated my estimates? Secondly, estimates are based on the value of your knowledge, and as you work, and also as you do other things, you learn. Your knowledge changes, and that will invalidate the basis for your estimate. So you might also want to think about that. Now, there is an argument that task breakdowns
might also change how you go about doing it. There's even an argument that your stories, product backlog items, requirements, whatever, also need to be reviewed in terms of their change in value, their change in risk, and their change in feasibility. Right? It might stay the same, but it's good practice to just review that and ask: has anything changed in the world around us that would cause us to review our estimates?

Last thing, and this is just a teaser. This is something I'm working on right now: using Bayesian inferencing. The cool thing about Bayesian inferencing is that it will give you essentially a reference class. It will give you a statistical probability distribution and not a single data point as a prediction. The challenge here: people will say, "Well, what happens if I don't have enough data for my prior?" It's the same thing as: what if we don't have enough data for a reference class? What if we don't have enough data to relate the standard deviation to the comfort level? Guess what: if you don't have enough experience, your estimates are just going to suck. That's what it tells you. Be honest about that.

And last thing: use glucose. A little inside story there. Glucose is fuel for the brain, and there's something about gummy bears in there.

So this is real scientific research that I've been trying to show you. There are a bunch of references on this. Even more. Even more, more, more. This is a relatively short reference list in comparison to my psychology talks.

So before I finish, I'd like to tell you a short story. A teacher stood in front of the class and asked, "Hey, does anyone have any questions?" And everybody's hand went up. Then the teacher said, "Remember, a question is when you want more information. If you want to tell me something about yourself, that's the discussion that happens later." Everybody's hands went down. Do you have any questions? Thank you.

All right, all right, all right. Hello,
hello. Okay, I'm here. Thank you, Joseph. I don't know, I mean, we've been talking about story points for the last 40 minutes or so. I want names. Who invented story points?

Okay, I was afraid that this question would be asked. The story as far as I know it, right? I worked for Chrysler on the C3 project back in the mid-90s. I did their tooling. Hold it, I actually have a slide here that shows this. I pulled this out of a talk on application development that Ron and Chet did back in 988. They estimated in terms of an ideal engineering week. They always used time. It's just that Ron got fed up with calling it an ideal engineering week, so Ron called it a point. And there's a big semantic difference here. Back when I started working with Kent, when he came to Europe after Chrysler and I worked as his assistant, he said, "Look, we have this issue. People like to estimate in time, but we're not good at it. Why don't you come up with an alternative?" So I came up with the idea of an abstract scalar unit, which I called the gummy bear, because there was a gummy bear store around the corner from the bank where we were working. For one, you need glucose for your brain. For another, these were really tasty, and Kent always wanted food at meetings. And third, since this was the way we represented work, we always hoped that in the middle of a long discussion one of the developers would pick up a gummy bear, the light bulb would go on, and they'd say, "Well, actually, we're wasting time talking rather than doing work." And the gummy bear, this idea of an abstract scalar unit, is what became the story point. I'm happy Ron's taken the fall for it, but there's a difference between the point being another name for an ideal engineering week and the story point being an abstract unit where people can't really figure out how much it is. And that's one of the reasons why we left it. We should have buried it very deep with a stake through its heart. It got out. It got out.

Okay, thank you. So maybe we've got time for one more
question. We have here a question from Tyrell: do you think that The Mythical Man-Month is still relevant today?

The Mythical Man-Month, yeah. In terms of, what was it, OS/360, probably not. But there's a lot of stuff in The Mythical Man-Month that I really still value. Yeah, a lot of stuff that Brooks wrote that I would totally... yeah, definitely.

All right. And you know what, I'll entertain one more question. Any recommendations on how to broach these ideas to project managers? I can make all these arguments to them. Why do they want unchangeable estimates anyway?

The problem with managers wanting really exact estimates is that they then push the responsibility for risk away from themselves, where it should be handled. They push it onto developers, and all they need to do is manage a plan. This is where the problem is. They want these hardcore estimates so that all they need to do is manage this plan, and they push all the risk away, and they don't want to deal with it. I'm sorry. I was lucky to work with fantastic CFOs in my career, and those were people who understood risk. Those were people who understood estimating in that way. Those were people who helped me develop this idea of comfort level, because they're doing financial predictions: how much do you think inflation will increase next month? How much do you think our revenue will increase next month? Sorry, but those are the type of people you need as managers. I'm sorry I can't answer any more now, but I'll be hanging around.

Yes. So another round of applause for Joseph, please. Show him your love. Here's a little keepsake from the Craft Hub group. There's a little pálinka in there.
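
The exponentially weighted moving average from the velocity discussion in the talk can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the speaker's code: the function names and the velocity history are hypothetical, the smoothing factor 0.28 is the rule-of-thumb alpha mentioned in the talk, and the alpha is fixed here rather than fitted from the data the way a Holt-Winters procedure would fit it.

```python
def ewma_forecast(velocities, alpha=0.28):
    """One-step-ahead forecast by simple exponential smoothing.

    velocities: completed work per Sprint, oldest first (hypothetical data).
    alpha: smoothing factor in (0, 1]; 0.28 is the rule of thumb from the talk.
    """
    if not velocities:
        raise ValueError("need at least one completed Sprint")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = float(velocities[0])  # seed the smoothed level with the oldest Sprint
    for v in velocities[1:]:
        # Each newer Sprint pulls the level toward itself by a factor of alpha,
        # so the weight of older Sprints decays geometrically.
        level = alpha * v + (1 - alpha) * level
    return level


def mean_forecast(velocities, n=5):
    """Baseline from the talk: arithmetic mean of the last n Sprints."""
    recent = velocities[-n:]
    return sum(recent) / len(recent)


# Made-up history of a team whose velocity has been trending upward.
history = [21, 25, 19, 30, 34]
print("arithmetic mean:", round(mean_forecast(history), 2))
print("EWMA (alpha=0.28):", round(ewma_forecast(history), 2))
```

In this made-up history the plain mean treats the Sprint from ten weeks ago as equal to the latest one, while the weighted average leans toward the recent Sprints. That is the "equally weighted" assumption the talk asks you to question. Whether 0.28 suits a given team is something to check against that team's own estimates and actuals.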