hi everyone I am Karan masu welcome to this video in this video we are going to look at the solution of the problem count number of substrings so first of all let us start by understanding the questions what does question say given a string of lowercase alphabets count all possible substrings not necessarily distinct that have exactly K distinct characters so basically we are given a string in which all the characters are lowercase alphabets that is from small a to small Z we need to count all the substrings that have exactly K distinct characters okay not more than K not less than K okay exactly K distinct characters we need to return the count of such substrings okay understand one thing if a character number of distinct characters means how many different characters are present so suppose I have a string a a a so the number of characters is three but the number of distinct character is only one that is a because it is repeated two more times okay so if we look at the example suppose ABA is the input string and K equals to two so we need to return the number of substrings which have exactly two distinct characters so one is a another is ba because both of them have two distinct characters A and B and the third one is a b a it also has two distinct characters A and B so the total number of such substring is three if we look at the second example suppose the string is a b a a c a and k equal to 1 it means we need to return the count of substring with exactly one distinct character so all the substrings which have one character Single Character substrings will have one distinct character that is a b a a c a a b a a c and a but there is also one more substring that is a a a a has two characters but number of distinct character is one that is k equals to 1 so a a is also counted in our answer there are no other substrings which have exactly one distinct character so the output is seven okay your task is to complete the function substring count which takes the string s and an integer K as input and Returns the number of substrings having exactly K distinct characters okay the expected time complexity is Big of modulus s that is length of string and expected auxiliary space is constant and the constraints are given here so now if we think about solving this problem then one of the Brute Force approach would be to calculate the number of distinct characters for all the substrings that is n into n + 1 by 2 if the length of the string is n then the total number of substrings is n into n + 1 by 2 so basically I count the number of distinct character for all these substrings okay and whenever the number of distinct character becomes equal to K I'll make increment my answer counter okay I'll start with answer equals to zero so basically generate all the substrings and count the number of distinct characters for all substrings and if whenever we get equals to K we'll increment the answer because we got a substring which is exactly K distinct characters right now this will take at least big of n Square time because there are n into n + 1 by two substrings even if we consider that counting of distinct characters is a constant time thing okay we will see how we can count number of distinct characters in a substring efficiently but even if we consider that it is a constant time work uh the time complexity would be big at least bigger of n Square because this many are the number of substrings right but the expected time complexity is bigger of n that is the length of the string so we need to think of something more efficient right so now what we will do here is see counting the number of subrings which have exactly K distinct characters is difficult than counting the number of substrings which have at least K distinct characters okay understand one thing so if I tell you to count uh the number of substrings which have at least K distinct characters then this would be easy work okay here what we need to count we need to count the number of substrings which have exactly K distinct characters okay but counting the number of substrings which have at least K distinct characters is an easy task how we will look at here okay but if this is an easy task then what we can do our answer would be count of sub strings number of subrings with at least K distinct characters minus count of substrings with at least k + one distinct characters did you got this point I want to find the number of substrings with exactly K distinct characters okay and what I said Counting of substrings with at least K disting characters is easy work so what I can say number of substrings with at least k distinct characters so what will be the substrings which I will get get in it I will get the count of the substrings which have K distinct characters plus count of substrings which have k + 1 distinct characters plus count of substrings which have k+ 2 distinct characters and so on if from this I subtract the number of substrings having at least k + one distinct characters so I'll subtract the count of substrings having at least k + one distinct character so what would it be at least k + 1 distinct characters plus k + 2 distinct characters plus count of substrings with k + 3 distinct characters and so on so basically this all part will get cancelled so I'll get the count of substrings with exactly K distinct characters I'll get my answer so now we need to look how we can count the number of substrings with at least K distinct characters okay okay now understand one thing suppose this is the string that you are given okay and you are considering this substring at present starting with l and ending with R okay now listen to my words carefully starting from index l r is the first ending index if I consider a subring starting with index l r is the first index where the number of distinct characters is greater than equal to K okay or it is equal to K so what I'm trying to say is that number of distinct characters from L to L from L to L + 1 up till from L to rus1 the number of distinct characters in this is less than K the number of distinct characters from L to R equals to K so R is the first ending index for starting index L where the number of distinct characters in this substring is ke okay so if starting with index l r is the first first index such that this substring has K distinct characters then can I say the substrings starting with l and ending at R have more than equal to K distinct characters yes can I say subring starting at index L and ending at index r + one will have more than K distinct characters yes because if this subring starting with l and ending at R has greater than equal to K distinct characters adding one more character to the subring that is R +1 index character will either REM the number of distinct characters will either remain same or the number of distinct characters will increase right so the number of distinct characters from L to R +1 is also greater than equal to K the number of distinct characters from L to R +2 is also greater than equal to K and so on the number of distinct characters from L to n minus one is also greater than equal to K so starting with index L these are the sub strings which have at least K distinct characters starting with index L these many are the sub strings which have at least K distinct characters right so can I say if starting with index l r is the first index such that this substring has greater than equal to K distinct characters then starting with l the number of substrings with at least K distinct characters is how much the ending index can be r it can be r + 1 it can be R +2 up to it can go to n minus1 okay so it would be n -1 - r + 1 that is equal to nus R did you got my point starting with index L the number of substrings whose distinct character count is greater than equal to K is n minus r and the substrings are l2r l2r + 1 L2 R +2 up to L2 n minus1 okay okay now so we have got starting with index L we got the count of the substrings with at least K distinct characters okay okay now understand one thing if starting with index l r is the first index such that the count of the distinct characters in this substring is k then starting with index L + one the first index the first ending index where the number of distinct characters will be greater than equal to K will be either or more than R if the if the substring from L to R has at least K distinct characters but R is the first ending index to satisfy this what does that mean that the substring from L to R minus one had less than K distinct characters then can I also say the substring starting with l + 1 and ending at R minus one will also have less than K distinct characters because if the substring starting with l and ending at r - one has less than K distinct characters and if I remove one character at lth index then my substring becomes starting with l+1 and end get R minus one this will also have less than K distinct characters right so the so if we start at l+ one index then the first ending index at which my count of distinct characters will be greater than equal to K will be either r or after R it cannot be before R did you got my point so we did two things first of all we understand this that starting from L if R is the first index where count of distinct character is k then starting with l+ 1 the first ending index will be either r or greater than R such that the count of distinct character is equal to K another think is if starting from l r is the first index where the number of distinct characters is greater than equal to k then starting with index L the number of substrings whose count of distinct characters will be at least K will be nus R okay so now what we can do is here L was this okay so we can vary our starting point from 0 to n minus one and for every index considering that as the starting index we can count the number of substrings starting with that index Lal to 0 L = to 1 Lal to 2 and so on we can count the number of substrings which will have at least K number of characters at least k number of distinct characters and we will add them up okay and that will give us the number of substrings with at least K different characters similarly considering the starting index as 0o then one then two weing our L from 0 1 2 3 and so on we can find the number of substrings which have at least k + one characters and add them up that will give us the total number of substrings with at least k + 1 characters and we will subtract them right and this will give us the total number of substrings with exactly K distinct characters okay now let's understand that how starting with index L we can efficiently count the number of distinct characters so now we said that we will uh keep changing our starting index that is L we will take it from 0 1 2 3 and so on and uh for every index considering at starting index we will count the number of uh substrings that is at least K distinct characters okay so what we can do is we can start with L = to R = to 0 okay so if this is my string this is my index zero this is my first starting index okay so suppose starting with this index I'll keep incrementing R I'll keep incrementing R until I do not get the count of distinct characters equals to K once I get it I know that now the number of substrings suppose this is R then starting with Lal to 0 there are n minus r number of substrings with uh disting characters count greater than equal to K right so I'll add n minus r to my answer okay now I'll take L = to 1 right now I want the count of substring starting with lals to 1 and whose count of distinct characters is greater than equal to K which is at least K so my R would be either here or more than that so I will not take it backwards I'll keep incrementing R until I do not get the count of distinct characters equal to okay again so it might be possible that when we increment L from 0 to 1 we may need to increment r or we may not need to increment R okay starting with l equals to one at this index only at this R only we may get the count of distinct characters s k or we may need to add some more new characters because the character which was here may be missing right suppose here it was a then we had b b c c c so the count of distinct characters was three let's say k = to 3 now when L became 1 we are considering the substring this much only here the count of distinct character is two so I might need to add some more characters suppose here it is C and here it is a so now starting with Lal to 1 uh this would be my first ending index where the count of distinct characters would be three that is b c and a okay but R will not reduce so I'll keep incrementing L and I'll also for each L I'll also keep incrementing R until the count of distinct characters is K okay and then I'll add this to my answer now how will I how will I maintain the count of distinct characters so basically I'll take one array C of size 26 where C of 0 will represent the count of a in the current substring which I'm considering C of 1 will represent the count of B in the current substring and C of 25 will represent the count of small Zed in the current substring which I'm considering whenever I am incrementing R so I'm taking a new character into consider ation I'll do c of s of r - 97 ++ okay so basically if this is small a then the aski value of small a is 97 so 97 minus 97 will be 0o so count of Zer will increase okay and whenever I'll increment l so I'm going from L = to 0 to 1 so first I'll decrease the count of the character at zero index so count of s of L Min - 97 I'll do it minus minus and then I'll increment the L okay so the count of this character is reduced okay so this array will be maintained now whenever we decrement the count of any character if that becomes equal to zero it means that now it is no longer present in the substring so the number of dis distinct character will decrease by one I'll take one variable F to keep the count of distinct characters so whenever the count become zero after subtracting we'll do F minus minus that is the count of distinct character has reduced whenever we increment the count of a character and if that becomes equal to one It means that it is the first time introduced in this subring so it adds up to the new distinct characters it is introduced for the first time so it is a new distinct character so at that time I'll increment F okay and every time I'll check whether f is less than K or greater than K or equal to to K and similarly I'll change the value of L and R okay and we'll count the number of substrings starting with l which has at least K distinct characters and we move from L from 0 to n minus one okay and one more thing what would be the time complexity so basically the time complexity would be Big O of n why because our L will go from 0 to n minus 1 and R will go from 0 to n minus one at Max there would be two Loops but two is constant so I can say Big O of N and the auxiliary space the auxiliary space would be Big O of one because we are not using any new space here now let's look at its actual implementation so that if you have any doubts it will be cleared so now if we look at the actual implementation so as I said what we will do we will count the number of substrings with at least K distinct characters and subtract the count of substrings with at least k+ one distinct characters okay and that will give us the count of substring with exactly K distinct characters so what I have done is I have taken one variable answer initialized at zero n is the length of the string and answer equal to substrings with at least p number of disting characters this is the name of the function so it basically counts the number of substrings with at least P distinct characters okay and I have passed the string the length of the string and K in it minus the number of substrings with at least P distinct characters and I passed the string n and k + 1 so this will count the number of substrings with at least k distinct characters and this will count number of substrings with at least k+ one distinct characters and we subtract it and we get our answer and we'll return it okay now if we look at this function then this is uh the number of substrings with at least P distinct characters okay so what I have passed in it the string the length of the string and int P so what I have done is first of all I have taken one flag variable fals to 0 which will basically keep the count of number of distinct characters I have taken one C VAR C array and as I say c of0 will keep the count of small a c of 1 will keep the count of small B and so on okay uh then I have made a loop from 0 to 26 and made the count equal to zero uh instead of lnr I used lnr while explaining here I have taken I and J so I is the starting index of A substring and J is the ending index of a substring for I equals to 0 to n I'm changing the starting index while the number of distinct characters is less than than P what I'll do I'll keep incrementing J so I'll keep making the substring bigger and bigger so that my number of distinct characters becomes equal to P okay so while J is less than n and F is less than P increment the count of the J character okay and if that count equals to one It means that a new distinct character has come so we'll increment the count of F and then go to the next value of J if the number of distinct characters is greater than equal to P so it might be POS possible that J became equal to n but the number of distinct character is still less than P okay so if it is greater than equal to P then what I'll add to my answer I'll add count equals to count + n minus J + 1 okay while while explaining the concept I said we will add n minus r here we are adding n minus J + 1 right why because when you consider the J character so suppose this is the I character and this is the J character and for starting index i j is the first index where my number of distinct characters became equal to P okay but what is happening here is when I'm incrementing the count of the character at J and incrementing f after that I'm also incrementing the value of J right so even if J is the first ending index for I where the count of distinct character P when I come out of the loop my J will be equal to actually J + 1 okay so that's why I uh here I have done n minus J + 1 okay so this would be the number of substrings starting with index I where the count of distinct characters is at least P okay and then now my I will go to I + 1 the starting index will change so first of all I'll subtract the count of the character at I index and if that becomes equal to zero so that that character has finished so we will decrement the distinct characters and finally I'll return the count which is basically nothing but the number of substrings of string s where the number of dist string characters is at least P okay and uh in the function we subtracted it the number of distinct substrings having at least K distinct characters uh we subtracted the number of substrings having at least k+ one distinct characters and that gave us the answer okay so now let's submit this code so let's submit it so we have solved this problem successfully I hope you understood the solution completely thank you