Transcript for:
Maximum Score from Removing Substrings

hey everyone welcome back and let's write some more neat code today so today let's solve the problem maximum score from removing substrings the idea is that we're given a string like this and we're also given two hardcoded strings each is of size two so a and ba a now the score associated with this string is four and the score associated with this string is five so these are the parameters these two and this these are not going to be given as parameters cuz like I said they're pretty much hardcoded so the idea is that by removing this substring from the input string so suppose from over here we gain a score of four by removing this string ba suppose from over here we gain a score of five we want to try to maximize the total score that we can get and then just return that score not the string after making like the removals just the score I'm going to kind of go through two different solutions to this problem the first one is going to be a linear time linear space solution the second one we're not really going to go in super depth it's more about how to optimize the space complexity which is kind of hard to do with this problem just because the nature of it it is a string and strings are immutable so anytime you want to make a modification to it it's going to be a linear space operation because you're going to have to create a new string but we will kind of talk about that because it's very similar to a leak code easy believe it or not cuz I think that kind of optimization is actually simple concept ctually so it's worth going over even getting into the first solution it is somewhat difficult so I'll be focusing on why does it work as we usually do I'll also be trying to guide you a little bit towards how you can try to arrive at this solution even though I'll be honest it's not easy like I can't just give you a one two three step book on how to exactly solve problems like this I wish it was that easy unfortunately it's not now getting into the problem I think it is somewhat obvious that the scores matter like at some point we're going to have a choice to either remove a and ba a I think that much is obvious just because I mean look at these two strings they're kind of the exact same like imagine if we had an input string that looked like this A B A and when you kind of look here we kind of do have that in this string so this is kind of where the choices come from we can possibly remove AB or we could remove ba obviously we want to do the one with the higher score which we don't necessarily know which one is that like we have to kind of determine that ourselves it's not always going to be this string and this example ba is the string that gives the higher score but it could be inverted this could be four this could be five in that case we'd favor this guy so obviously this starts to feel like a greedy problem and it actually is and the solution is actually very very simple if I showed it to you you'd think wow that's really it but that's not the hard part the hard part is well why does it work so I'm going to kind of get into that first of all first let me show you why a different solution won't work if we try to solve this problem the first observation you have to make is that both of these strings are of size two that's a very important observation to make let me tell you why imagine we have one pointer we're iterating over the input we want to know do we have a matching pair like if I'm over here I want to know do I have a matching pair all I need is the previous character to know if I have a matching pair so far right so that's easy that's great we just look at the previous character we saw is it the same okay great and now suppose we wanted to actually remove this how do we do it consider this consider we're adding all of these elements to some data structure as we go and now we're going to just skip adding this one we're not going to add it to that data structure and not only that but we're going to remove the previous character that we saw does this sound familiar to you does this sound like we're getting towards a data structure that you might know of yes it's called a stack that's how we solve this problem this is of size two we only remove the previous character when we want to remove one of these pairs this is another key observation you really have to make to solve this problem but even that is not enough because again the hard thing is we have choices and even more if I remove a pair like just look at this string right here here BB a a look at that string all I see right now is one pair in it right we don't have any occurrences of a this is the only pair I see now what happens when I remove it boom boom well the new string becomes ba now we have a new pair so the hard part about this problem is that removing a pair could potentially introduce new pairs so again another reason why the stack is necessary and now you might think you have a solution you might think this imagine we have some string like XYZ a b a x whatever right this is just a random string but watch this we go through it we get to a okay like this is added to the stack so far we get to be okay we see we now have a matching pair beautiful right we should remove it shouldn't we we get a score of four no not yet because before we try to do that let's just check okay I have a b right now if the next character is an a which it is and if the score of removing that type of pair is even greater than this one which it is then let's not do this let's not remove this pair let's skip it because by the time we get here we can remove that pair so basically you might think that there's a way to solve this problem by iterating through it a single time and just trying to prioritize whichever pair pair that has the higher score that gives us right so whenever we come across a choice like a decision we favor the one with the higher score you might think this will work let me give you a counter example and it's literally the second example in this problem so I just have a b here to show that it's the one that Associates with this and ba is the one with this so the answer to this problem is 20 how do we get there well look right over here when we're over here like when we're at this B we're going to then get to a we're going to see okay we have this string right here ba a we see that it's the one that gives us a lower score so maybe we can form a better string with this a maybe we can form a nope the next character is also a so then you might say okay well then we have no choice but to pop this and remove the string ba a even though it gives us a lower score nope you made a mistake let me show you the better way to do it look these two are fine as is get to the second a then remove a here the beautiful thing about that is then we kind of get rid of those and then we can join this a with this B over here let me make the string just a little bit bigger this portion is what we're specifically focusing on we have the choice of a ba but that's not what we want we want the ab get rid of this and now we have another one another AB get rid of that again that gave us a score of 10 if we did not do that this is what what we would have had we would have had the ba okay and then we would have had the ab that would have been a nine so clearly the approach of just looking at the element and comparing it with the previous one and potentially looking at the next element is not going to work the approach that will work is this a two-phase approach first remove all occurrences of the one that gives us a higher score so go through this input string which I'll quickly draw and only remove move instances of a b so let's iterate we have an A we have an A we have a b okay compare it to the previous one it's an a get rid of these two great now we're at B check what's at the top of our stack it's an a get rid of these two now we're at a a we keep going anytime we have like non-ab characters notice that these are characters that will never be removed therefore any character on this side of the wall and any character on this side of the wall will not be able to join each other I feel like there's a political analogy somewhere in here but I'll try to uh skip that we're at B now and we're going to keep going B A well there is a pair right but remember this is not the pair we're looking for we're only looking for abs right now so again just skip it now we're here now we're here remove this since it's a matching pair now we're here check the top of the stack remove this as well so now we removed four pairs each gave a score of five now the second phase of the algorithm would be removing all occurrences of the other type of pair now in this that does not exist but I can show you an example where they could have existed we could have had an extra B over here and there's no way that we would have been able to get rid of this because there's multiple consecutive A's over here we deleted everything over here and this is like a wall like I kind of said so this would have remained and in the second phase of the algorithm we would have gotten rid of this alternatively or in addition there also could have been an extra a over here we removed this portion this is the wall so no way this would have gotten deleted and this would only be deleted in the second portion of the algorithm where we would gain a score of four from it okay so now if this makes sense to you that's great you can start coding it up but there's one question you might have forgotten to ask me how do you know that after the second phase of the algorithm by removing some ba pairs how do you ensure with 100% certainty that by removing these in the second phase of the algorithm we aren't introducing any a b pairs because it's okay if we introduce an extra ba pair in the second phase of the algorithm that would look something like this where we had a two A's over here so imagine you know something like this this would not have gotten deleted in the first phase of the algorithm and in the second phase of the algorithm it would we would have deleted this and then we would have found this matching pair and deleted it as well in the second phase but how do you know for sure that by doing that we're not introducing any pairs of the first type because if we are then we better delete those two cuz that's the only way we can maximize the score so how do you know well the answer is some examples let me show you let's consider consider these two strings and really dig deep consider if this was the first phase as it was when I went through this example there's a few possibilities we remove these two characters but what's on either side of them now suppose it's like a non-ab character that's very easy because of course that's not going to introduce a new pair if even one side has a non-ab character it's not going to introduce a new pair I mean how could it possibly do that right I don't think I have to probably go in depth but if you have questions about that feel free to make a comment like I said any non-ab character is practically a wall so now we realize that we only care about the cases where there's ABS on both sides now there could be an A on each side there could be a B on each side or it could be something like this a or ba a I think it goes without saying that in the first two cases where obviously not creating new pairs isn't that right okay so let's go to the second two cases hypothetically speaking if we had something that looks like this then we delete this we obviously created a new pair AB B of the same type right so we know with 100% certainty that that new pair that we just created is also going to get deleted so that's another simple case right we don't even worry about that you saw that happen right over here so go ahead and delete this forget about it okay but then the other Case by deleting an AB we created a new ba a pair is that a problem for us not really this is the first phase of the algorithm like I said if we create a new ba that's perfectly fine that's going to get deleted in the second phase of the algorithm right so we just went through every single possibility and I Prov to you there's nothing wrong with deleting these in the first phase of the algorithm so go ahead and forget about all the cases now I'm going to run through the exact same over here and I'm about to blow your mind going through four cases remember that the first two are pretty trivial so we kind of just skipped those so now look at the next two and pay very close attention first I'm going to go through the last one over here imagine we delete a ba that's perfectly fine because we're in the second phase of the algorithm where we're deleting all Bas I delete the first one okay I just created a second one who cares delete that too right so nothing wrong with that now watch this this is the case we want to avoid oh my God we deleted a ba and we somehow created an AB pair but oh no the first phase of the algorithm is already over we created a boo boo nope let me tell you something there's a concept in mathematics called proof by contradiction I've been wanting to make a course on this for so long I think I mentioned it at least a year ago but there is a concept called proof by contradiction and I just did it for you look at this example I just showed you how could it ever be possible that we have a substring in the input that looks like this didn't I say we deleted all AB pairs so you tell me how the heck did we get an AB pair in the second phase of the algorithm how did we have a string that looks like this to begin with if we deleted all of them this is a proof by contradiction this case will never exist therefore we only need two phases of the algorithm which will guaranteed to delete all a and ba Pairs and if this seems complicated to you I just want you to know that this is the hard part about greedy algorithms implementing them is usually very easy proving why they work is the difficult part but almost always you can use the technique called proof by contradiction to do it I really hope I get the chance to make a course on discret math for you guys and let me know if you'd be interested in seeing it so now let's code up the solution to this problem it's going to be linear time and linear space because we're using a stack okay so these are kind of the notes that I took when I was solving the problem they may or may not be helpful for you but I hope that the explanation that I just gave was actually helpful um okay I have the code here let me delete that as well sorry about that but the idea is that if we're going to do this algorithm in two phases we should probably create a helper function for us which like I'm going to call remove Pairs and so there's two things we want to keep track of what's the pair we're trying to remove that's either going to be a or ba a and what's the score associated with that pair once we have that this is how we can use that function just assume that this is going to remove the pairs from that string and it's going to update s and the way it's going to do that is by uh I'm going to declare it as a non-local variable something like this but if you wanted to I think you could also do something like this like self. s is equal to S like set it as a variable for this object or class or whatever but I'll do it like the non-local way it doesn't really matter how you do it and this is going to return the score we got by removing those pairs and then it's going to update the string in place assume that that function works this is how we're going to use it we're going to have our result initially it's going to be zero this is what we're trying to return but we want to update it we want to do something like this like we want to add to it the function call and we want to do like ab and whatever score is associated with ab and we want to do the same thing for BA and the score associated with that the only problem is we don't know what order we want to call these in if this score is higher we want to do this first and this second but if this score is higher we want to do this one first how do we know which score is higher well for that we have if statements don't we so I'm going to do something clever I'm going to say the pair is equal to a b if x is greater than y otherwise I'm going to set it to ba a and so this pair is what I'm going to pass in the first time okay Okay but well what score should we do do we need another if statement for that if you like if statements go ahead but I like to keep things concise so I'm going to go ahead and use max of X and Y we already know we found the max here so this is the one that has the max in the second part we can do the Min of XY that's straightforward but what do we do for the pair well just the opposite of whatever this was AKA reverse it so something like this it's not really expensive to do this reversal down here because it's a string of length too okay now we've done the hard part believe it or not this part is going to be not too bad I'm going to have a variable here which is going to keep track of the score um actually score is a parameter here so I'm going to just call it result again just so you know this result is a different variable from the one here they have a different scope so just want to mention that and I'm going to have my stack that I was talking about earlier I'm going to go through every character in the input we only care if this character is the second character in the pair because if it is if it's the second character in the pair then we want to ask the question okay is the stack nonempty and is the top of the stack equal to the first character in the pair that we're looking for if it is then do something like this stack. pop and update the score increment it by whatever Delta that we passed in whatever one of these we passed in to the score here add that to the result here now if that's not the case if we're not popping from the stack then we're pushing to the stack so here I'm going to do stack. append the character and then at the end we want to return the result uh zooming out a bit but don't forget we do want to actually modify the string that's why I declared it as non-local here so how do we do that what's the easiest way to do that well you don't want to remove from a string so we can just use the stack that we were keeping track of and then right at the end do something like this the delimiter is going to be an empty string we want to join all the strings in the stack together so you can do that like this in Python and I think we can then set s equal to that and this will actually update the S that's out here that's why I declared it non-local and that's very important because when we execute the second phase of the algorithm we want the string to be updated so this is the whole code let's run it to make sure that it works as you can see it does it's pretty efficient now one last thing I wanted to mention the more efficient space solution and I won't code it up I just wanted to kind of talk about it conceptually because we can't really code it up in Python imagine this imagine that the input instead of being a string it was actually an array how would you solve this problem in constant space If instead of a string you were given an array so imagine that these are like numbers or something I know they're not I know they're literally characters but try to use your imagination well we could then do the removals in place how would we do it well I think it's called leak code problem 26 or something remove duplicates so in that problem like anytime you have adjacent duplicates so like I got an A here I have a second a they're right next to each other and I think in the context of that problem the duplicates were also adjacent like if they existed I think the input array was sorted or something like that maybe I'm forgetting but you can correct me if I'm wrong in the comments but if that were the case we do something like this okay this is an A okay well just move this a over here obviously the a still exists over here but we'd have two pointers we'd have this pointer telling us where to put the next character so this B would then go over here and then we move here then we'd shift this pointer again so it'd be kind of like that so now in the context of this problem let's say we're moving ABS we start here we have this a we have a now what do we do just take this thing and try to pretend like it's not there because what we're now going to do is we're going to have a pointer here and our pointer that was already iterating is now going to take every character we see and shift it to the pointer over there so now the B is going to go here every time we do that we're going to take this pointer and shift it to the right by one this would now start looking something like this all of these would be over here so this would be b a a x y and then this part would also be over here BB and then this a would also go over here and then this a well firstly we have taken the a over here put it there and then we'd see that we put the B here so now we would take that left pointer it was telling us where to put the next character now we'd shift it to the left by two so now when we see these two be's we're going to overwrite that pair that we just saw it would look something like this I think we deleted a couple pairs and I think I even forgot about one of them cuz now there's going to be an additional One On The Stack but I think you probably more or less get the idea at this point like we delete this pair we have shifted the pointer over here so we know now that the character is going to go here now we put that b over here we delete this pair again shift the pointer so that now we'd overwrite this part then we'd have like this last B go over here something like that and then we'd say everything at the end is chopped off that's how we can maintain the constant space because we are modifying the input instead of creating a new string or list that's the idea but obviously it doesn't apply to this problem since it's a string I'll leave things there if you found this helpful check out n code. for a lot more I got a lot of cool stuff coming soon thanks for watching and I'll see you soon