So, my name is Erik Demaine — you should call me Erik. Welcome back to 6.046; this is Lecture 2. Today we're going to fill in some of the mathematical underpinnings of Lecture 1. In Lecture 1 we just barely got our feet wet with some analysis of algorithms — insertion sort and merge sort — and we needed a couple of tools. We had this big idea of asymptotics: forgetting about constants and just looking at the lead term. So today we're going to develop asymptotic notation, so that we know what that means mathematically. We also ended up with a recurrence with merge sort — the running time of merge sort — so we need to see how to solve recurrences, and we'll do those two things today. (Question: "Can you speak up a little bit?") Yes, I will speak louder, thanks. Good — even though I have a microphone, I'm not amplified. Good.

So let's start with asymptotic notation. We've seen some basic asymptotic notation; I'm sure you've seen it in other classes before — things like big-O notation — and today we're going to define it rigorously, so we know what's true and what isn't, what's valid and what isn't. Unfortunately, today is going to be really mathematical — no algorithms today, which is sort of an anticlimax — but next lecture we'll talk about real algorithms and apply all the things we learn today to them.

So this is big-O notation — capital-O notation. We write f(n) = O(g(n)). This means there are some suitable constants c and n₀ such that f(n) is bounded by c·g(n) for all sufficiently large n. It's a pretty intuitive notion; we've seen it before. We're going to assume f(n) is non-negative here, and I just want f(n) to be bounded above by g(n), up to a constant factor. We've seen a bunch of examples: something like 2n² = O(n³) would be fine. Roughly this means: if you drop leading constants and lower-order terms, then this is less than or equal to that. So big O corresponds roughly to "less than or equal to"; this is the formalization.

A funny thing about this notation is that it's asymmetric. Normally you think of equality as symmetric: if a = b then b = a. But that's not true here: we don't have n³ = O(n²), even though n² = O(n³). We'll see exactly what that means in a second, but it's a bit of a bizarre notation, and you should always think about what it really means. Another way to think about what it really means is that f(n) is in some set of functions that are like g. So you could define O(g(n)) to be a set of functions: O(g(n)) = { f(n) : there exist constants c > 0, n₀ > 0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n₀ }. Same definition, nothing fancy — it's a bit of a long definition, which is why we use the notation, to avoid writing it over and over. So instead of "2n² equals O(n³)", what we really mean is that 2n² is in the set O(n³). When we write the equal sign, we in some sense mean "∈", "is in the set"; but we're going to use the equal sign. You could write "∈", and occasionally you see papers that do, but "=" is the notation we'll use — and that has the consequence that the equal sign is asymmetric, just like the "∈" operator. We also have some nifty ways of actually using big-O notation — using it as a macro; I'll get to that in a moment. (First, a quick concrete check of the definition, below.)
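To pin down the quantifiers, here is a minimal sketch — my own, not from the lecture; the helper name `is_big_o_witness` is made up — that tests whether particular constants (c, n₀) witness f(n) ≤ c·g(n) on a finite range. It can refute a bad witness, but it can never prove f = O(g), since the definition quantifies over all n ≥ n₀.

```python
# Hypothetical helper, not from the lecture: check whether given
# constants (c, n0) satisfy 0 <= f(n) <= c*g(n) on a finite range.
# A finite check can only refute a claimed witness, never prove
# f = O(g), since the definition covers all n >= n0.
def is_big_o_witness(f, g, c, n0, n_max=10_000):
    return all(0 <= f(n) <= c * g(n) for n in range(n0, n_max))

# Example from the lecture: 2n^2 = O(n^3), witnessed by c = 2, n0 = 1.
print(is_big_o_witness(lambda n: 2 * n**2, lambda n: n**3, c=2, n0=1))  # True
# The asymmetry: no constants make n^3 = O(n^2) work for large n.
print(is_big_o_witness(lambda n: n**3, lambda n: n**2, c=100, n0=1))    # False
```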
By the way, we have a lot to cover today, so I'm going to go relatively fast. If anything is unclear, just stop and ask questions, and I will slow down; otherwise I'll take it all as completely obvious and keep going at full speed.

Okay, so the convention — this is intuitive, I guess, if you've done some macro programming, but here it's a bit more mathematical. We've defined big-O notation in the form "… = O(something)": so far, on the right of the equal sign we only have O of some function. But it's useful to allow a more general expression on the right-hand side that involves big O. For example, say we have f(n) = n³ + O(n²). This is attempting to give an error bound: it says f(n) is basically n³, but there are lower-order terms that are O(n²). Formally, it is shorthand for: there is a function h(n) ∈ O(n²) such that f(n) = n³ + h(n). So it's saying there are some lower-order terms, bounded above by some constant times n² for sufficiently large n; and then f(n) equals — now this is a true equality — n³ plus that error term. This is very useful: I'm expressing what the lead term is, and then saying, well, there's other stuff, and it's all at most order n². Saying f(n) = O(n³) would also be true, but that's a weaker statement; this one is more refined. We won't need this too often, but it's useful sometimes. Last class we even had a big O inside a summation — you can use them all over the place; the point is that each one represents some function in that set.

A bit less intuitive, and more subtle, is what it means to have big O on the left-hand side. It means the same kind of thing, but there's a convention for what equality means — and this is why the equal sign is asymmetric. You should read "=" as "is": "is" means that everything over here is something over here, so there's an implicit "for all" on the left-hand side and a "there exists" on the right-hand side. For example, n² + O(n) = O(n²) is a true statement — anything that is n² + O(n) is also O(n²) — but not the other way around. So it's asymmetric. If you think about it, it's pretty intuitive, but it's subtle, so be careful. It says: for any expansion of the macro on the left-hand side, there's an expansion of the macro on the right-hand side such that we get equality. What this allows you to do is chain equal-sign relations — a chain of "is"es — and conclude that the very first one is bounded by the very last one. So you can chain equal signs the way you normally would; you just can't flip them around. Good — that's big-O notation. Any questions about that? (Don't crush my fingers.)

Big O is great for expressing upper bounds, but we also want to talk about lower bounds. For algorithms we usually care about upper bounds on running time — the running time is at most n², at most n log n, up to big O — but sometimes we need to express that a function is at least some quantity. For example, we'll show that sorting requires at least n log n time in some model. The notation for that is big-Omega notation, and it's pretty symmetric; I'll just write out the set definition: f(n) = Ω(g(n)) means f(n) is at least some constant times g(n) for sufficiently large n — that is, Ω(g(n)) = { f(n) : there exist constants c > 0, n₀ > 0 such that f(n) ≥ c·g(n) ≥ 0 for all n ≥ n₀ }. I'm basically just reversing the inequality relation between f and g. (These conventions are written out with explicit quantifiers below.)
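Written out with explicit quantifiers — my rendering of what's on the board, not extra content — the macro conventions read:

```latex
% "O(...)" on the right-hand side is an existential:
\[
  f(n) = n^3 + O(n^2)
  \iff
  \exists\, h(n) \in O(n^2):\; f(n) = n^3 + h(n).
\]
% On the left-hand side, "=" reads "is": for-all on the left,
% there-exists on the right:
\[
  n^2 + O(n) = O(n^2)
  \iff
  \forall\, h(n) \in O(n)\;\; \exists\, h'(n) \in O(n^2):\;
  n^2 + h(n) = h'(n).
\]
```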
Nothing surprising — just to have it there. A random example, and now we'll get a little more sophisticated: √n = Ω(log n). You should read this as: up to constant factors, √n is at least log n for sufficiently large n. So Ω corresponds roughly to "greater than or equal to".

Let me give you some analogies. We have big O, which is like ≤, and big Ω, which is like ≥, and I'm going to fill in some more in a moment. It's nice to have all the usual operators: normally we have strict less-than, strict greater-than, and the equal sign, and we want those analogues in the asymptotic world, where we ignore constant factors and lower-order terms. So we have, for example, Θ(g(n)) — that's a capital Theta, which means you write the horizontal bar just in the middle, as opposed to all the way through; I didn't invent Greek, so that's the way it is. Θ means you're ≤ and you're ≥, up to constant factors: it's the intersection of the two sets, O(g(n)) ∩ Ω(g(n)). So Θ is sort of like the equal sign — though of course very different: you have things like n² = Θ(2n²), because you ignore constant factors. But the other relations we saw don't all hold with Θ: n² + O(n) = Θ(n²) does hold, but √n versus log n does not hold with Θ, because √n is really asymptotically bigger than log n; and examples like n² versus n³ don't hold with Θ either.

And we have the strict notations, the little-o and little-ω notations. There's no little theta, because there's no notion of strict equality versus non-strict equality. Little o corresponds roughly to <, and little ω corresponds roughly to >. This is notation you'll just have to get used to; I'm not going to define it precisely here, because it's almost exactly the same — the difference is that instead of saying "there exist constants c and n₀", you say "for every constant c there exists a constant n₀". So the inequality between f and g must hold for every c, not just for one, and n₀ can now depend on c. You can still assume n is sufficiently large, but this gives you a strict inequality: no matter what constant c you put in front of g — say we're doing little o — f will still be less than c·g for sufficiently large n.

Some random examples — again, we're ignoring constants: n² = o(n³); n² is less than c·n³ for sufficiently large n. It's a bit subtle here; to prove something like this — it becomes intuitive after you manipulate it a bit — you have to figure out what n₀ is in terms of c. I think it's something like n₀ = 2/c; that should be right if we use less-than-or-equal-to. As long as n is at least that big — and you should think of c here as being epsilon, as in the usual epsilons and deltas — no matter how small c gets, I can still upper-bound n² by c·n³. But whenever you have Θ, you have neither of the strict relations: for example, (1/2)n² = Θ(n²), and it's not o(n²), and it's not ω(n²), because it is exactly n² up to constants. So you get some sense of an order relation out of all this, although there are some messy behaviors, as you'll see in your problem set. (The little-o definition and the n₀ = 2/c example are written out below.)
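Here is the little-o definition with its quantifiers, plus the n₀ = 2/c calculation for the example, written out (my write-up of the board work):

```latex
\[
  f(n) = o(g(n))
  \iff
  \forall c > 0 \;\; \exists n_0 > 0:\;
  0 \le f(n) < c\, g(n) \text{ for all } n \ge n_0.
\]
% Example: n^2 = o(n^3). Given c > 0, take n_0 = 2/c; then for n >= n_0:
\[
  c\, n^3 = (c\, n)\, n^2 \;\ge\; \left(c \cdot \tfrac{2}{c}\right) n^2
  = 2 n^2 \;>\; n^2.
\]
```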
So, any questions about asymptotic notation? That was the quick rundown. Now we're going to use it to solve some recurrences — although we won't use it that much today; we'll use it a lot more on Wednesday.

Okay, so we move on to the second topic of today, which is solving recurrences. You've probably solved some recurrences before, in 6.042 or whatever discrete-math class you've taken. We're going to do more, with some more techniques that are particularly useful for analyzing recursive algorithms, and we'll see that mostly on Wednesday. There are three main methods we're going to use here for solving recurrences. The first is the substitution method. Now, there's no general procedure for solving recurrences — no good algorithm for solving them, unfortunately. We just have a bunch of techniques; some of them work some of the time, and if you're lucky, one will work for your recurrence. It's sort of like solving an integral: you just have to know various methods, and it's usually easy to check when you have the right answer — just like with integrals, where you differentiate and see, oh, I got the right answer. That's essentially the idea of the substitution method.

The substitution method will always work, but unfortunately step one is "guess the answer" — and you have to guess it correctly, which makes it a bit difficult. You don't have to guess it completely: you can usually get away with not knowing the constant factors, which is a good thing, because we don't really care about the constant factors. You guess the form — you say, oh, it's got to be roughly n², so presumably it's some constant times n². So: you guess the form, and then you try to verify by induction that the recurrence satisfies the bound. That's the key: substitution uses induction. And from that you usually get the constants for free — you figure out what the constants have to be in order to make it work. That's the general idea; we'll see a few examples — actually, the same example several times.

Unfortunately, this is what you might call an algorithm that uses an oracle — the oracle being knowing the right answer. But sometimes it's not too hard to guess; it depends. Look at this recurrence: T(n) = 4T(n/2) + n. Implicitly, we always have some base case, T(constant) = constant — usually T(1) = Θ(1); for algorithms that's always the case, so we don't really write it. We want to solve this thing. Does anyone have a guess what the solution is — ideally someone who doesn't already know how to solve this recurrence? Okay, how many people know how to solve it? A few. And of the rest, any guesses? Look at what's going on: you have T(n/2) — let's ignore the additive n term, more or less. If we double the argument from n/2 to n, we multiply the value by four; and then there's this additive n, but that doesn't matter so much. So what function do you know where, when you double the argument, the output goes up by a factor of four? "n²?" Yes — you should think n², and you'd be right. But we won't prove n² yet; let's prove something simpler first, because it turns out proving that it's at most n² is a bit of a pain — we'll see that in a few minutes. Let's guess that T(n) is O(n³) first, because that will be easier to prove by induction — so you sort of see how it's done in the easy case — and then we'll actually get the right answer, n², later. (A quick numeric check of the n² guess is below.)
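As a quick numerical sanity check — mine, not from the lecture — you can unroll the recurrence for powers of two and watch T(n)/n² settle toward a constant, which is what the n² guess predicts:

```python
from functools import lru_cache

# T(n) = 4*T(n/2) + n, with T(1) = 1 (base-case constant chosen
# arbitrarily); restricted to powers of two so division is exact.
@lru_cache(maxsize=None)
def T(n: int) -> int:
    if n <= 1:
        return 1
    return 4 * T(n // 2) + n

for k in range(1, 16):
    n = 2**k
    print(n, T(n) / n**2)
# The ratio tends to 2: for this base case, T(n) = 2n^2 - n exactly,
# consistent with T(n) = Theta(n^2).
```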
So what I'm going to do is guess that T(n) is at most some constant times n³. Let me be a little more precise: I can't use big-O notation in the substitution method — I have to expand it out into explicit constants. I'll show you why in a little bit; let me just tell you at a high level what's important about not using big-O notation. Big-O notation is great if you have a finite chain of big-O relations: n² = O(n³), which is O(n⁴), which is O(n⁴), which is O(n⁴) — all true — and so you get n² = O(n⁴). But if you have an infinite chain of those relations, the first thing is not necessarily big O of the last thing. You have to be very careful.

For example — this is a total aside; it's in the lecture notes — suppose you want to prove that n = O(1). This would be a great relation: if it were true, every algorithm would have constant running time. So this is not true — that's right: "not", in Wayne's-World notation. But you could try to "prove" it by induction. The base case: 1 = O(1). That's true. The induction step: suppose n − 1 = O(1); then n = (n − 1) + 1, and if n − 1 is O(1) and 1 is O(1), their sum is also O(1). And that implication is correct — if you knew n − 1 were O(1), and 1 is O(1), then their sum really is O(1). Yet this is a false proof. You can't induct over big O's. What's going on is that the constants implicit in those O's are changing: you have some O(1) here and some O(1) there, and you're probably doubling the constant every time you apply the relation. A finite number of doublings is no big deal — it's just a constant, 2 to the number of doublings — but here you're doing n doublings, and that's no good: the constant now depends on n. We avoid this kind of problem by writing out the constant explicitly and making sure it doesn't change.

Okay, so now I've written out the constant; I should be safe. Here's what I need to prove: I'm assuming T(k) ≤ c·k³ for all k < n, and now I have to prove it for k = n. So I take T(n) and just expand it — the obvious thing. The recurrence involves T(n/2), and I know a fact about T(n/2) because n/2 < n. So T(n) = 4T(n/2) + n, and I have an upper bound on T(n/2) from the induction hypothesis, so this is at most 4·c·(n/2)³ + n. (Oh boy — I have a feeling that by the end of the lecture these blackboards will be totally stuck. It's not going up... it's okay, I only need half a board. I'd like to say it's from my superhuman strength, but I believe it's the quality of the blackboards — or their hinges.) Continuing on, let's expand a little: (n/2)³ is n³/2³, 2³ is 8, and 4/8 is a half, so we have (1/2)·c·n³ + n. Where I'd like to go, at the bottom, is that this is at most c·n³ — that's what I'd like to prove, to re-establish the induction hypothesis for n. So, to see when that's the case, I'll just
write this as what I want — the desired value, c·n³ — minus whatever I don't want; this is called the residual. Now I have to figure out the residual: I want c·n³, but I only have (1/2)·c·n³ here, so I need to subtract off (1/2)·c·n³ to get the lead term correct; and then there's the +n, and there's a minus sign out front, so it's −n inside. So: (1/2)·c·n³ + n = c·n³ − ((1/2)·c·n³ − n). For this to be at most c·n³, I need the residual to be non-negative: (1/2)·c·n³ − n ≥ 0. That's pretty easy to arrange, because I have control over c — I get to pick c to be whatever I want. For example, this is true if c is at least 2 and n is at least 1, since then (1/2)·c·n³ is at least n³, which is at least n.

So what we've proved is that T(n) is at most some constant times n³ — and the constant is small. That's an upper bound, but it's not a tight upper bound: we actually believe the answer is n², and it is. Be a little careful — this does not mean the answer is n³; it just means it's at most n³: T(n) = O(n³). And this is a proof by induction, so technically I should have included a base case; there's a little bit missing. The base case is pretty easy, because T(1) is some constant, and what we need is that it's at most c·1³ = c — which is true as long as you choose c sufficiently large. Now, we don't care about constants, but the point is to be a little careful: it's not necessarily true that T(n) ≤ 1·n³, even though in the induction step all we needed was a small c — for the base case, c might have to be 100, or whatever T(1) is. So be a little careful there; it doesn't really affect the answer, usually, because we have very simple base cases here.

Okay, so let's try to prove the tight bound of O(n²). I'm not going to prove an Ω bound, though you could prove an Ω(n²) bound as well using the substitution method; I'll be satisfied for now with an upper bound of n². Same recurrence; I want to prove T(n) = O(n²). I'll do the same thing, and I'll write a bit faster, because this is basically copying — except now instead of a 3 I have a 2. So T(n) = 4T(n/2) + n; I expand T(n/2), getting at most 4·c·(n/2)² + n; and now instead of 2³ I have 2², which is only four, the fours cancel, and I get c·n² + n. If you prefer to write it as desired minus residual, that's c·n² − (−n), and I need the residual, −n, to be non-negative. It's damn hard for −n to be non-negative: if n were zero we'd be happy, but unfortunately this is an induction on n — it has to hold for all n ≥ 1. So this is not ≤ c·n². Notice the temptation is to write c·n² + n = O(n²), which is true as a statement about this one step — certainly c·n² + n is order n². That's true, but it does not complete the induction: to complete the induction, you have to prove the induction hypothesis for n with this same constant c, and here you're getting a constant like c + 1, which is no good. So the step is true, but useless — it does not finish the induction. (The failed step is written out below.)
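The failed attempt, collected in one place (my write-up of the board work):

```latex
\[
\begin{aligned}
  T(n) &= 4\,T(n/2) + n
        \;\le\; 4\,c\,(n/2)^2 + n
        \;=\; c\,n^2 + n \\
       &= \underbrace{c\,n^2}_{\text{desired}}
          - \underbrace{(-n)}_{\text{residual}}
        \;\not\le\; c\,n^2 ,
\end{aligned}
\]
% since the residual -n is negative for every n >= 1.
```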
So you can sort of ignore that attempt — the proof doesn't work, which is kind of annoying, because we feel in our heart of hearts that T(n) is n². It turns out that to fix this, you need to express T(n) in a slightly different form. This is again divine inspiration — if you have a good connection to some divinity, you're all set; for the rest of us mere mortals it's a little bit harder. But maybe you could guess it: the idea is to strengthen the induction hypothesis. We assumed a relatively weak thing — T(k) is at most some constant times k², with no lower-order terms; we didn't know the constant, and that was fine. Now let's look at lower-order terms — maybe they play a role. If you look at the failed attempt, you say: I'm getting something like n², and the constants are pretty damn tight — the fours cancel, the c is just preserved — so how am I going to get rid of this lower-order +n? Well, maybe I could subtract off a linear term in the hypothesis, and if I'm lucky it'll cancel with that +n. That's all the intuition we have at this point; it turns out it works.

So assume T(k) ≤ c₁k² − c₂k for all k < n. We look at T(n) = 4T(n/2) + n as usual, and now we expand a slightly messier form: T(n) ≤ 4(c₁(n/2)² − c₂(n/2)) + n. The first part is the same — the fours cancel, giving c₁n², which is good; that's the form we want. Then we have something times n; let's figure it out: the 4 hits the c₂·(n/2), so the 2 ends up upstairs — check — giving −2c₂n, plus the +1·n. Now we can write this as desired minus residual, and we have to be a little careful here, because now we have a stronger induction hypothesis to prove: we don't just need at most c₁n² — which would be fine here, since we could choose c₂ large — what we really need is c₁n² − c₂n, minus some leftover. So: c₁n² − 2c₂n + n = (c₁n² − c₂n) − (c₂ − 1)n. (Let's see — that looked awfully negative... it's c₂ minus one — thank you; I'm making sign errors; there's a minus out front and one minus inside, so there we go.) Again I want my residual, (c₂ − 1)n, to be ≥ 0, and if I have that, I'm all set in making this inductive argument.

(Office hours start this week, in case you're eager to go. They're all held in some room in Building 24, which is roughly the midpoint between here and Stata, I think for no particular reason. You can look at the web page for details.)

Continuing along: when is (c₂ − 1)n ≥ 0? That's true if c₂ ≥ 1, which is no big deal — again, we get to choose the constants however we want; the bound only has to hold for some choice of constants. So set c₂ ≥ 1, and then we're happy: the whole thing is ≤ c₁n² − c₂n. It's kind of funny: this finishes the induction step for any value of c₁, provided c₂ ≥ 1. But we have to be a little more careful — c₁ actually does have to be sufficiently large. (The whole strengthened induction step is collected below.)
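Collecting the steps of the strengthened induction in one place (again my write-up of the board):

```latex
% Strengthened hypothesis: T(k) <= c_1 k^2 - c_2 k for all k < n.
\[
\begin{aligned}
  T(n) &= 4\,T(n/2) + n
        \;\le\; 4\bigl(c_1 (n/2)^2 - c_2 (n/2)\bigr) + n \\
       &= c_1 n^2 - 2 c_2 n + n
        \;=\; \underbrace{c_1 n^2 - c_2 n}_{\text{desired}}
          - \underbrace{(c_2 - 1)\,n}_{\text{residual}}
        \;\le\; c_1 n^2 - c_2 n
        \quad \text{if } c_2 \ge 1.
\end{aligned}
\]
```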
Any particular reason why c₁ had better not be negative? Indeed, c₁ has to be positive for this to work — but it even has to be larger than just positive, depending... sorry, I've been going so fast I haven't asked you questions; now you're caught off guard. Yes: because of the base case. Exactly. In the base case we'll have T(1), and we want to prove it's at most c₁·1² − c₂ = c₁ − c₂. T(1) is some constant we've assumed, so we need to choose c₁ sufficiently larger than c₂: c₂ has to be at least 1, and c₁ may have to be at least 100 more than that, if T(1) is 100. So this will be true if c₁ is sufficiently large — and "sufficiently large" now means with respect to c₂. You have to be a little bit careful, but in this case it doesn't matter much.

Any questions about the substitution method? That was the same example three times. In the end we got the right answer, but we sort of had to know the answer in order to find it, which is a bit of a pain. It would certainly be nicer to just figure out the answer by some procedure, and that's what the next two techniques give us. ("Sorry — how would you prove a lower bound?") I haven't tried it for this recurrence, but you should be able to do exactly the same thing: argue that T(n) is greater than or equal to something of the form c₁n² − c₂n. I didn't check whether that particular form works, but I think it does — try it. These other methods will give you, in some sense, upper and lower bounds if you're a little bit careful, but to really check things you pretty much have to do the substitution method, and you'll get some practice with that. Usually we only care about upper bounds, so proving upper bounds like this is what we'll focus on, but occasionally we need lower bounds, and it's always nice to know you have the right answer by proving a matching lower bound.

The next method we'll talk about is the recursion-tree method. It's a particular way of adding up a recurrence, and it's my favorite way: it usually just works — that's the great thing about it — and it provides you intuition for free; it pretty much tells you what the answer is. It's slightly non-rigorous, though — this is a bit of a pain, so you have to be really careful when you apply it, or you might get the wrong answer — because it involves dot-dot-dots, our favorite three characters, and dot-dot-dots are always a little bit non-rigorous. Technically, what you should do is find the answer with the recursion-tree method, then prove it's actually right with the substitution method. Usually that's not necessary, but you should at least keep in mind that it's required for rigor, and probably for the first few recurrences you solve, you should do it that way. When you really understand the recursion-tree method, you can be a little more sloppy — if you're really sure you have the right answer.

So let's do an example. We saw recursion trees very briefly last time, with merge sort, as the intuition for why it was n log n. If you took an example like the one we just did, the recursion-tree method is dead simple; so, just to make our life harder, let's do a more complicated recurrence: T(n) = T(n/4) + T(n/2) + n². Imagine some algorithm that starts with a problem of size n, recursively solves a problem of size n/4, then recursively solves a problem of size n/2, and does n² non-recursive work on the side. What does that solve to? I mean,
that's a bit less obvious, I would say. So what we're going to do is draw a picture: just expand out that recursion in tree form, and then add everything up. The general principle in the recursion-tree method is that we draw the recurrence as a picture. We say, well, T(n) equals the sum of n², T(n/4), and T(n/2) — a weird way of writing a sum, but why not write it that way. This is going to be a tree, built by recursively expanding each of the leaves. I start by expanding T(n) into that; then I keep expanding, expanding, expanding everything. Let's go one more step: we had n² with children T(n/4) and T(n/2). If we expand one more time, the root is still n², the first child becomes (n/4)² and the second becomes (n/2)², plus their recursive branches: under (n/4)² we get T(n/16) and T(n/8), and under (n/2)² — here my arithmetic wears thin — we get T(n/8) again and T(n/4), I believe. And you just keep going until you get down to the base case, where T is a constant.

So now I'm going to skip some steps and say dot-dot-dot — this is where you have to be careful. We have n² at the top; then (n/4)² and (n/2)²; then — this level is easy because I've already done them all — (n/16)², (n/8)², (n/8)² again, and (n/4)²; and so on through various levels of recursion, and at the bottom we get a bunch of constants — these are the leaves.

I'd like to know how many leaves there are. That's one challenge: how many leaves could this tree have? It's a bit subtle: unlike merge sort, or unlike the previous recurrence we solved, the number of leaves here is a bit funny, because we're recursing at different speeds. The n/4 subtree is going to be much smaller than the n/2 subtree — smaller depth — because on one side we've already gone down to n/16 while the other side has only gone down to n/4. But how many leaves are there in this recursion tree? All I need is some reasonable upper bound — I can tell you it's at most n¹⁰, but that's a bit unreasonable. ("Less than n?") Good — why is it less than n? Exactly: I start with a problem of size n and recurse into a problem of size n/4 and a problem of size n/2, and when I get down to 1, I stop. Now n/4 + n/2 = (3/4)n, which is strictly less than n — so I start out with n's worth of stuff, throw away a quarter of it, and recurse, so there's definitely going to be less than n's worth of stuff at the bottom: strictly fewer than n leaves.

At this point I've done nothing interesting. The second cool idea in recursion trees is this: you don't just expand the tree, look at it, and say, God, how the hell am I going to sum that? You sum it level by level. That's the only other idea, and it usually works really, really well. Here it's a bit complicated; I have to think a bit. The first level is n² — easy. For the second level I have to think a lot harder — you know, there are three kinds of mathematicians, those who can add and those who can't, and I'm the latter kind — so I need your help: what do these add up to? It's n² over something. ("Five over sixteen?") Good, thanks: (1/16 + 1/4)n² = (5/16)n². Now I really need your help — that one I could have done, but the next level is a little harder. I'll go look at my notes while you compute it.
Any answers? ("73 over 256?") Can anyone confirm that? That seems a bit high to me; 73 doesn't sound right. (I could cheat...) ("64?") Closer. It's actually important that we get this right. The 256 is correct, I can tell — everyone should know 16² is 256; we're computer scientists. ("25?") Good — we have two people saying 25; therefore it's correct, by democracy. 25 is also what my notes say, and I computed it at home, so 25 is the right answer: (1/256 + 1/64 + 1/64 + 1/16)n² = (25/256)n².

Now, does anyone notice something magical about this progression: 1, 5/16, 25/256? ("It's squares each time?") Good — and if we were going to add these up, you might call it a geometric series. Very good. So it turns out the level sums are geometric, and we know how to sum geometric series — at least, you should. We start at n²; at the bottom — well, the leaves are not quite a single level — we get something like n; but we're decreasing geometrically. We know the solution of the recurrence is the sum of all the numbers in the tree, so if we add it up level by level and then add up all the levels, that gives us the answer — the total, computed level by level. It's just a cute way to compute it, and it usually gives you nice answers, like geometric ones. So we have (1 + 5/16 + (5/16)² + …)·n² — if we believe in faith, having seen three numbers of the progression, the general term is 5ᵏ/16ᵏ, at least we hope, and so on. It doesn't actually go on infinitely, but let's just assume it does — that gives an upper bound.

Now, if you're going to know one thing about geometric series, you should know that 1 + 1/2 + 1/4 + … — summing all the inverse powers of two — gives 2. We're computer scientists; we've got to know at least the binary case. This is like writing 1.11111… in binary: 0.11111… repeating forever is the same as 1, so 1.11111… is 2. Our series is even smaller: the ratio 5/16 is less than a half, and we're squaring each time, so the sum is less than 2. If you want, there's a nifty formula for the general geometric series — here it gives 1/(1 − 5/16) = 16/11 — but all we need is that it's a constant. So T(n) = O(n²). It's also Ω(n²), pretty obviously, because the top level alone is n² — so there's our lower bound, and we have the answer within a factor of two, which is pretty good (you actually get a better factor here): T(n) = Θ(n²).

Okay — and we still have time; good. So that was recursion trees. It's a little shaky here, because we have these dot-dot-dots and we just believe the sums are geometric. It turns out that most of the time they are geometric, no problem; but here I would definitely check it with the substitution method, because it was not obvious to me going in that it would be geometric. In the cases we'll look at in a moment it will be much clearer — so clear that we can state a theorem that everything works fine. (Below is a quick numeric check of those level sums.)
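Here is that quick numeric check of the level sums — my own sketch, not part of the lecture — expanding the tree for T(n) = T(n/4) + T(n/2) + n² level by level, with sizes scaled so that n = 1:

```python
# Expand the recursion tree of T(n) = T(n/4) + T(n/2) + n^2 level by
# level and sum the n^2 work at each level (with n scaled to 1, so
# each sum comes out as a fraction of n^2). Base cases are ignored,
# which is fine for a few levels near the top.
level = [1.0]
total = 0.0
for depth in range(6):
    s = sum(m**2 for m in level)
    total += s
    print(f"level {depth}: {s:.6f} * n^2")  # 1, 5/16, 25/256, ...
    level = [m / 4 for m in level] + [m / 2 for m in level]
print(f"running total: {total:.6f} * n^2  (bounded by 16/11 = {16/11:.6f})")
```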
There's one more method we're going to talk about, and you could essentially think of it as an application of the recursion-tree method, but made precise — it's an actual theorem, whereas with recursion trees, if the dot-dot-dots aren't obvious, you'd better check them. The sad part about the master method is that it's pretty restrictive: it applies only to a particular family of recurrences, of the form T(n) = a·T(n/b) + f(n), for some constant a and some function of n that we'll call f. In particular, it will not cover the recurrence I just solved, because there I was recursing on two subproblems of different sizes; here, every subproblem you recurse on must be of the same size. Think of this as a recursive algorithm: you have a subproblems, each of size n/b, so the total recursive cost is a·T(n/b), and then you do f(n) non-recursive work. A few constraints: a should be at least 1 — you should have at least one recursive call; b should be strictly greater than 1 — you'd better make the problem smaller, or else the recursion runs forever; and f should have a nice property: f(n) should be asymptotically positive. How many people know what "asymptotically positive" means? No one? Okay, you haven't read the textbook — that's okay, I haven't read it either, although don't tell Charles. And what might you think it means? Yes: it means that for large enough n, f(n) is positive — f(n) > 0 for all n ≥ some constant n₀. Eventually it should be positive; we don't care if it's −1 at n = 1 — not a big deal; it won't affect the answer, because we only care about the asymptotics in n.

Good. So, the master method: you give it a recurrence of this form, and it tells you the answer. That's the great thing about it. The annoying thing is that it has three cases. It's a bit long — it takes a little longer to memorize than the other methods, because the others are just ideas, while here we actually need to remember a few things. Before I state the theorem, there is one very simple idea: we compare the non-recursive work f(n) with a very particular function, n^(log_b a). Why n^(log_b a)? You'll see later — it turns out to be the number of leaves in the recursion tree, but that's foreshadowing. So f(n) is either less than, equal to, or bigger than n^(log_b a); and since we care about asymptotics, we have to be a little more precise about what less, equal, and bigger mean. You might think they mean little o, Θ, and little ω, and it would be nice if the theorem held in all of those cases, but it leaves some gaps.

Case 1 is when f is smaller — and not just little o, but quite a bit smaller: it has to be polynomially smaller than n^(log_b a). That is, for some positive ε, f(n) = O(n^(log_b a − ε)) — really polynomially smaller. We can't handle the little-o case; that's a bit too strong a conclusion to hope for. But then the answer is really simple: T(n) = Θ(n^(log_b a)). Great — that's case 1.

Case 2 is when f(n) is pretty much equal to n^(log_b a) — and by "pretty much equal" I mean up to poly-log factors: f(n) = Θ(n^(log_b a) · lgᵏ n), where lgᵏ n means (log₂ n)ᵏ — you should know this notation — for some k ≥ 0. For example, k could be 0, and then they're equal up to constant factors. It's really important that k is non-negative; negative k will not work. (It should probably be an integer — actually, it doesn't matter whether it's an integer.) So f could be n^(log_b a) times log n, or times nothing, whatever. Again the solution is easy: T(n) is n^(log_b a) times — presumably it has to be at least lgᵏ n — it turns out
it's lg^(k+1) n: T(n) = Θ(n^(log_b a) · lg^(k+1) n). That's case 2.

We have one more case, which is slightly more complicated — we need to assume slightly more for case 3. Case 3 is roughly when f(n) grows faster than n^(log_b a). Here's one place where we get to use Ω: f(n) = Ω(n^(log_b a + ε)) for some positive ε. So it should grow not just faster, but polynomially faster — in case 2 the gap was a poly-log factor; here it's a polynomial factor. In this case we need one more assumption about f, because we worry a little about how f behaves: we want to make sure that as you go down the recursion, f gets smaller — it would be kind of nice if f gets smaller as you go down; otherwise you're again trying to sum something that doesn't shrink. The condition is: a·f(n/b) ≤ (1 − ε′)·f(n) for some ε′ > 0. What this says is: if I take the recurrence and just throw f in instead of T, then f(n), at the top of the recursion tree, should be bigger than the total at the next level down — the sum of all the values at the next level, a·f(n/b), is at most some constant strictly less than one times the value at the top level. I need that to make sure things get smaller as I go down. Then T(n) = Θ(f(n)), and that's the theorem — the master theorem, or whatever you want to call it. It's not named after some guy named Master; it's just the master of all methods, because it's very easy to apply. So let's apply it a few times — it's a bit much to take in all at once — and then I'll give you a sketch of the proof, to see that it's really not that surprising once you look at the recursion tree.

First, let's just try using it. Take T(n) = 4T(n/2) + n. Here a = 4, b = 2, and f(n) = n. The first thing to compute is n^(log_b a) — this one even I can do: log base 2 of 4 is 2, so it's n². Is f(n) smaller or bigger than n²? f(n) = n, and n² is clearly bigger, by a polynomial factor, so we're in case 1. What's the answer? ("n².") Yes: T(n) = Θ(n^(log_b a)), which here is Θ(n²).

Let's do some slight variations — I'll keep a and b the same and just change f. Say T(n) = 4T(n/2) + n². (This is like drill — you know, spelling.) n² is asymptotically the same as n², even up to constants. So what's the answer? This is case 2. Slightly harder: what's k in this example? ("Zero.") So the answer is — survey says — Θ(n² log n). Good.

A couple more. T(n) = 4T(n/2) + n³: what's the answer? ("n³.") Θ(n³), case 3. Okay, I know this is pretty boring at this point — we're just applying this stupid theorem. How about T(n) = 4T(n/2) + n²/log n: what's the answer? Good — no one should answer; it's a bit tricky. I forget exactly — I think it's, like, n² log log n. Yes, that's right. But you should know that this does not follow from the master method — this would be k = −1, which is not allowed. It's something you'd have to solve another way — a recursion tree would probably be a good way to do this one, and you need to know some properties of logs to see how it goes. Here the master method simply does not apply, so you have to use a different method. (Below is a small sketch encoding the three-case decision.)
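Here's a small sketch — my own encoding, not part of the lecture — of the three-case decision for driving functions of the special form f(n) = n^d · lgᵏ n, where the "polynomially smaller/larger" conditions reduce to comparing the exponent d with log_b a. The function name `master` and this restriction to n^d·lgᵏ n are my choices; note the exact float comparison `d == c` is only safe when the exponents are exactly representable:

```python
import math

# Sketch of the master-method case analysis, restricted to
# f(n) = n^d * lg(n)^k, so "polynomially smaller/larger" becomes
# a comparison of d against the critical exponent log_b(a).
def master(a: float, b: float, d: float, k: float = 0.0) -> str:
    c = math.log(a, b)  # critical exponent log_b(a)
    if d < c:
        return f"Case 1: T(n) = Theta(n^{c:g})"
    if d == c and k >= 0:
        return f"Case 2: T(n) = Theta(n^{c:g} * lg(n)^{k + 1:g})"
    if d > c and k >= 0:
        # For f = n^d * lg^k n with d > log_b(a) and k >= 0, the
        # regularity condition a*f(n/b) <= (1-eps)*f(n) holds, since
        # a/b^d < 1.
        return f"Case 3: T(n) = Theta(n^{d:g} * lg(n)^{k:g})"
    return "Master method does not apply (e.g. k < 0, as in n^2/lg n)"

print(master(4, 2, 1))        # Case 1: Theta(n^2)
print(master(4, 2, 2))        # Case 2: Theta(n^2 * lg n)
print(master(4, 2, 3))        # Case 3: Theta(n^3)
print(master(4, 2, 2, k=-1))  # n^2/lg n: master method does not apply
```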
Good. Okay, the last thing I want to do is tell you why the master method is true, which makes it much more intuitive — especially using recursion trees, you see why everything works. (I shouldn't bring that board down unless I can get this one up... hey, that one slides well; that one doesn't.) We're going to prove it — well, this is a sketch of a proof, not the full thing. You should read the proof in the textbook; it's not much harder than what I'll show, but it's good for you to know the formal details. I don't have time here to do all of them, so I'll just show you the salient parts. This is the proof sketch, or the intuition, behind the master method.

What we're going to do is take the recursion tree for this recurrence, add up each level, and then add up all the levels, and see what we get. We start with f(n) at the top. After we've expanded one level, we get a different subproblems, each of size n/b, and after we expand them, each contributes f(n/b) — they're all the same size. Then we expand all of those, getting a subproblems from each, contributing f(n/b²); and so on and so on — the sizes decrease geometrically — until at the bottom we get constant-size problems. The bottom is a bit special, because it's the base case: we pay some other constant at each leaf.

We'd like to know how many leaves there are, but that's a little tricky at the moment, so let's first compute the height of this tree. I start with a problem of size n and want to get down to a problem of size 1; how many levels does that take? (This is probably too easy for some, and not at your fingertips for others.) ("Log base b of n.") Good: the height of the tree is h = log_b n, because that's how many times you divide by b until you get down to 1. That's great — now I can compute the number of leaves, because I have branching factor a and height h: the number of leaves is aʰ = a^(log_b n). Let me expand that a little: by properties of logs, we can take the n downstairs and put the a upstairs, getting n^(log_b a) — our good friend. So that's why n^(log_b a) is so important in the master method: what we're doing is comparing f(n), the top level, with n^(log_b a), which — up to Θ's — is the bottom level. The leaves are all at the same depth here, because we're decreasing at the same rate in every branch, so if I add up the cost at the bottom level, it is Θ(n^(log_b a)).

Adding up the top level gives f(n) — not terribly exciting. The next level is a little more interesting: it's a·f(n/b), which should look familiar if you already have the master method memorized — it's exactly the quantity in the case-3 regularity condition. So in case 3 we know a·f(n/b) has decreased by some constant factor, 1 − ε′: that level is a constant factor smaller than the one above it. Then you sum the next level, and it's a²·f(n/b²) — ah, I see I actually wrote this with the wrong parenthesization, sorry about that: it's f(n/b²), not f(n)/b². So in case 3, at least, this sequence decreases geometrically, and if it decreases geometrically — up to constant factors — it's dominated by the biggest term, the top one, which is f(n). Therefore, in case 3, we get Θ(f(n)). (The log identity behind that leaf count is written out below.)
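That properties-of-logs step — taking the n downstairs and the a upstairs — written out (my rendering):

```latex
\[
  \#\text{leaves} \;=\; a^{h} \;=\; a^{\log_b n}
  \;=\; \bigl(b^{\log_b a}\bigr)^{\log_b n}
  \;=\; \bigl(b^{\log_b n}\bigr)^{\log_b a}
  \;=\; n^{\log_b a}.
\]
```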
Now let's look at the other cases — and let me adapt those cases to how much time we have left. Wow, lots of time — five minutes; tons of time. Okay, let me write case 3 down properly. Here, I would argue, is a place where the dot-dot-dot is pretty obvious: level k is simply aᵏ·f(n/bᵏ), and in case 3 we assumed the costs decrease geometrically as we go down the tree.

That was sort of backwards, starting with case 3; let's do case 1, which is the other intuitively easy case. In case 1 we know that f(n) is polynomially smaller than the bottom level, n^(log_b a), and the levels change by this very simple procedure in the middle. Now I'm going to wave my hands — this is where you'd need a more formal argument — and claim that the level sums increase geometrically: they have to, because f(n) is polynomially smaller than the bottom term, and the various levels in the middle interpolate geometrically from the small one to the big one. Therefore the big one dominates, because it's again a geometric series. As I said, this is intuition, not a formal argument — case 3 was pretty formal, because we assumed exactly what we needed, but here you need a bit more work: the levels may not increase exactly geometrically, but they could increase faster, and that's also fine. So in case 3 you're dominated — you're always dominated by the biggest term in a geometric series — by the top term, f(n); and in case 1 you're dominated by the bottom term, Θ(n^(log_b a)).

Case 2 is pretty easy too, but you need to know some properties of logs. In case 2 we assume that all the levels are basically the same — the top equals the bottom — and since the levels change in this very procedural way, all of the ones in the middle have to be pretty much the same too. Well, not quite, because the bottom doesn't have the log factor: at the top we have n^(log_b a)·lgᵏ n, and at the bottom just n^(log_b a), so the logs do disappear. It turns out they disappear pretty slowly: if you look at, say, the top half of the levels, they all still carry the lgᵏ factor, and only toward the bottom does it start to fade. (I'm giving you some oracle information here: if you take logs and don't change the argument by too much, the logs remain — maybe "halfway" is too far, but the claim is that each level is roughly the same, and in particular the uppermost levels are all asymptotically equal.) Therefore the total cost is roughly (cost of one level) × (number of levels): one level costs Θ(n^(log_b a)·lgᵏ n) — what I called f(n) — and there are h levels, where h = log_b n; b is a constant we don't care about, so that's Θ(log n). Therefore T(n) = Θ(n^(log_b a) · lgᵏ n · log n) = Θ(n^(log_b a) · lg^(k+1) n) — f(n) times log n, as claimed.

That's the very quick sketch. Sorry I'm being pretty fuzzy on cases 1 and 2 — read the proof, because at some point you'll really have to manipulate logs in that way. And that's all. Any questions? We're all eager to go — okay, thanks; see you Wednesday.