Understanding SVM Classifier and Hyperplane

Welcome back. In this video, I will discuss how to find the equation of the hyperplane with the maximal margin using the SVM classifier for a given data set. In the previous videos, I discussed the SVM classifier algorithm and solved many examples with it; the link for those videos is given in the description below.

In this case, we have been given a data set with three input vectors and two features. Given this data, we need to apply the SVM algorithm and find the hyperplane with the maximum margin. Three input vectors are given to us, hence n = 3. The first input vector, x1 bar, is (2, 2), the second input vector is (4, 5), and the third input vector is (7, 4). The target for the first input vector is y1 = -1, and as you can see in this table, y2 = +1 and y3 = +1.

This is what the hyperplane equation looks like in the SVM classifier: the weight vector dot the input vector, minus the bias, set equal to 0. We need to find the values of the weight vector and the bias, because the input vector is known to us. How do we find the weight vector and the bias? To calculate them, we first need to calculate the alpha vector. The alpha vector is nothing but a set of variables, one for each input vector. We have three input vectors, so we need to calculate three alpha values: alpha 1, alpha 2 and alpha 3. While calculating them, we have to consider certain conditions: the summation of alpha i y i over all values of i should be equal to 0, and every alpha should be greater than or equal to 0.

So, in the first step, we calculate the alpha values by maximizing this particular function, phi of the alpha vector. I will substitute the value of n here; n is 3 in this case.
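The formula shown on screen is not captured in the transcript. Assuming it is the standard hard-margin SVM dual, which matches the two conditions on the alphas and the expansion that follows, it can be written for n = 3 as:

```latex
\max_{\alpha}\; \phi(\alpha) = \sum_{i=1}^{3} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{3} \sum_{j=1}^{3}
    \alpha_i \alpha_j \, y_i y_j \, (\bar{x}_i \cdot \bar{x}_j)
\quad \text{subject to} \quad
\sum_{i=1}^{3} \alpha_i y_i = 0, \qquad \alpha_i \ge 0 .
```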
Now we need to expand this function. The first summation becomes alpha 1 plus alpha 2 plus alpha 3. For the double summation, we go through the combinations: first i = 1 and j = 1, then i = 1 and j = 2, then i = 1 and j = 3, and so on. Those are the different combinations we need to consider, and we can write them out here.

For each combination, we need to calculate a dot product. When i = 1 and j = 1, it is x1 dot x1. How do we get it? x1 is (2, 2), and again x1 is (2, 2). 2 into 2 is 4, and 2 into 2 is 4, and their sum is 8. Similarly, for i = 1 and j = 2, how did we get 18? x1 dot x2 is 2 into 4 plus 2 into 5, that is 8 plus 10, which is 18.

Similarly, we calculate all the possible combinations and put the values into the function. For the first part, we get phi of alpha equal to alpha 1 plus alpha 2 plus alpha 3. For the second part, we get 1 by 2 times the bracket 8 alpha 1 squared plus 41 alpha 2 squared plus 65 alpha 3 squared, and so on for the cross terms. Once we have this equation, we need to consider the constraint. What is the constraint here? Minus alpha 1 plus alpha 2 plus alpha 3 is equal to 0.
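The dot products for all combinations of i and j can be tabulated in a few lines. This is a minimal sketch using plain Python; the names `X`, `y`, `dot`, `gram` and `q` are illustrative choices, not anything shown in the video.

```python
# Training data from the example: three 2-D inputs and their labels.
X = [(2, 2), (4, 5), (7, 4)]
y = [-1, 1, 1]


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


# gram[i][j] = x_i . x_j  (indices are 0-based here, 1-based in the talk)
gram = [[dot(xi, xj) for xj in X] for xi in X]

# q[i][j] = y_i * y_j * (x_i . x_j): the coefficient of alpha_i * alpha_j
# inside the double summation of the function being maximized.
q = [[y[i] * y[j] * gram[i][j] for j in range(3)] for i in range(3)]

for row in gram:
    print(row)
# [8, 18, 22]
# [18, 41, 48]
# [22, 48, 65]
```

The diagonal entries 8, 41 and 65 are the coefficients of the squared terms quoted in the talk, and the entry 18 is the x1 dot x2 value worked out by hand.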
This constraint is nothing but alpha 1 equal to alpha 2 plus alpha 3. We now use it to simplify the equation: wherever there is alpha 1, I replace it with alpha 2 plus alpha 3. You can see here that in the first part, alpha 1 is replaced with alpha 2 plus alpha 3, so we get alpha 2 plus alpha 3 plus alpha 2 plus alpha 3, which is 2 times (alpha 2 plus alpha 3). Similarly, replacing alpha 1 with alpha 2 plus alpha 3 in the second part gives this part of the equation.

Now we need to maximize this function to calculate alpha 1, alpha 2 and alpha 3, and that is a bit difficult to do by simplification alone. But notice that only two variables remain, alpha 2 and alpha 3. So we differentiate phi of alpha with respect to alpha 2 and equate it to 0, and then we differentiate with respect to alpha 3 and equate it to 0.
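Written out, the substitution and the two derivative conditions look as follows. The cross-term coefficients are not read out in the talk; they are reconstructed here from the dot products of the data (x1 . x3 = 22, x2 . x3 = 48) and agree with the 13, 32 and 29 that the speaker uses when differentiating.

```latex
\begin{aligned}
\phi(\alpha) &= \alpha_1 + \alpha_2 + \alpha_3
  - \tfrac{1}{2}\bigl( 8\alpha_1^2 + 41\alpha_2^2 + 65\alpha_3^2
  - 36\,\alpha_1\alpha_2 - 44\,\alpha_1\alpha_3 + 96\,\alpha_2\alpha_3 \bigr) \\[4pt]
\text{with } \alpha_1 &= \alpha_2 + \alpha_3: \\
\phi(\alpha_2, \alpha_3) &= 2(\alpha_2 + \alpha_3)
  - \tfrac{1}{2}\bigl( 13\alpha_2^2 + 32\,\alpha_2\alpha_3 + 29\alpha_3^2 \bigr) \\[4pt]
\frac{\partial \phi}{\partial \alpha_2} &= 2 - 13\alpha_2 - 16\alpha_3 = 0 \\
\frac{\partial \phi}{\partial \alpha_3} &= 2 - 16\alpha_2 - 29\alpha_3 = 0
\end{aligned}
```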
Then we solve for the values. If I differentiate this equation with respect to alpha 2, the first part, 2 times (alpha 2 plus alpha 3), becomes 2, because the alpha 3 term is a constant and its derivative is 0. In the second part, which is multiplied by minus 1 by 2, the derivative of 13 alpha 2 squared is 2 into 13 into alpha 2, that is 26 alpha 2; the cross term 32 alpha 2 alpha 3 gives 32 alpha 3; and 29 alpha 3 squared gives 0, because alpha 3 is a constant here. If you simplify, 26 alpha 2 divided by 2 is 13 alpha 2, and 32 alpha 3 divided by 2 is 16 alpha 3, so the first equation is 2 minus 13 alpha 2 minus 16 alpha 3 equal to 0. Similarly, differentiating with respect to alpha 3 gives 2 minus 16 alpha 2 minus 29 alpha 3 equal to 0.

Once you solve these two equations, we get alpha 2 = 26/121 and alpha 3 = -6/121. Knowing alpha 2 and alpha 3, we can easily calculate alpha 1, because alpha 1 is equal to alpha 2 plus alpha 3: we get alpha 1 = 20/121. So after the first step of the SVM classifier, we have alpha 1, alpha 2 and alpha 3.

Next, we calculate the weight vector using this equation: the weight vector is the summation of alpha i y i xi bar for i from 1 to n. Expanding, it becomes alpha 1 y1 x1 plus alpha 2 y2 x2 plus alpha 3 y3 x3. Alpha 1 is 20/121, y1 is minus 1, and the x1 vector is (2, 2). Similarly, we put in all the other values and simplify, and we get the weight vector (2/11, 6/11).

Once we have the weight vector, what is the next step? We need to calculate the bias.
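The alpha values and the weight vector can be double-checked with exact fractions. This is a minimal sketch: it solves the two derivative equations with Cramer's rule (my choice of method; the talk does not say how the system was solved) and then forms the weighted sum of the inputs.

```python
from fractions import Fraction as F

X = [(2, 2), (4, 5), (7, 4)]
y = [-1, 1, 1]

# Derivative conditions, rearranged as a 2x2 linear system:
#   13*a2 + 16*a3 = 2
#   16*a2 + 29*a3 = 2
det = F(13 * 29 - 16 * 16)        # determinant of the system, 121
a2 = F(2 * 29 - 16 * 2) / det     # Cramer's rule for alpha 2
a3 = F(13 * 2 - 16 * 2) / det     # Cramer's rule for alpha 3
a1 = a2 + a3                      # from alpha 1 = alpha 2 + alpha 3
alphas = [a1, a2, a3]

# w = sum_i alpha_i * y_i * x_i, computed one component at a time.
w = tuple(
    sum(a * yi * xi[k] for a, yi, xi in zip(alphas, y, X))
    for k in range(2)
)

print(alphas)  # [Fraction(20, 121), Fraction(26, 121), Fraction(-6, 121)]
print(w)       # (Fraction(2, 11), Fraction(6, 11))
```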
The bias is calculated using this equation: 1 by 2 times the sum of two terms. The first term is the minimum of the weight vector dot xi over all i where yi is plus 1, that is, over the examples of the positive class only. The second term is the maximum of the weight vector dot xi over all i where the target is minus 1. In this case, yi is plus 1 in two cases and minus 1 in one case. So we need the minimum of w dot x2 and w dot x3, because those correspond to y2 and y3, plus the maximum of w dot x1, because x1 is the only example of the negative class.

Now we calculate the dot products. The weight vector dot x2 is 38/11, and the weight vector dot x3 is also 38/11. Similarly, the weight vector dot x1 is 16/11. Between the first two, the minimum is 38/11. So we have 1 by 2 times (38/11 plus 16/11), and once you solve it, you get 27/11. That means the bias is b = 27/11.

Now, the SVM classifier function is given by f of x equal to the weight vector dot the x vector, minus b, where the x vector is (x1, x2), the two feature values.
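A short sketch to check the bias arithmetic with exact fractions. It takes the weight vector (2/11, 6/11) from the talk as given; the helper names `dot`, `scores` and `f` are illustrative.

```python
from fractions import Fraction as F

X = [(2, 2), (4, 5), (7, 4)]
y = [-1, 1, 1]
w = (F(2, 11), F(6, 11))  # weight vector quoted in the talk


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


scores = [dot(w, x) for x in X]  # w . x_i for each training point

# b = 1/2 * (min over the positive class + max over the negative class)
pos_min = min(s for s, yi in zip(scores, y) if yi == 1)
neg_max = max(s for s, yi in zip(scores, y) if yi == -1)
b = F(1, 2) * (pos_min + neg_max)


def f(x):
    """Classifier function f(x) = w . x - b, as defined in the talk."""
    return dot(w, x) - b


print(scores)              # [Fraction(16, 11), Fraction(38, 11), Fraction(38, 11)]
print(b)                   # 27/11
print([f(x) for x in X])   # [Fraction(-1, 1), Fraction(1, 1), Fraction(1, 1)]
```

With these values, f evaluates to -1 at x1 and to +1 at both x2 and x3.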
So f of x becomes 2/11 multiplied by x1, plus 6/11 multiplied by x2, minus the bias, which is 27/11. To get the hyperplane with the maximal margin, we equate f of x to 0: 2/11 x1 plus 6/11 x2 minus 27/11 is equal to 0. Multiplying through by 11 and dividing by 2, that is x1 plus 3 x2 minus 13.5 equal to 0. This is the final hyperplane equation for the given data set.

Now we need to draw this hyperplane. One thing to remember: look at alpha 1, alpha 2 and alpha 3. Alpha 1 is positive and alpha 2 is positive, but alpha 3 is negative. Because of that, we cannot consider x3 as a support vector; we consider only x1 and x2 as support vectors, because a support vector is a point whose alpha value is greater than 0. With x1 and x2 as the support vectors, we can draw the hyperplane something like this. As there are only two support vectors, the hyperplane should go through the midpoint of these two data points.

Let us calculate the midpoint: 2 plus 4 is 6, and 6 divided by 2 is 3; 2 plus 5 is 7, and 7 divided by 2 is 3.5. If you put 3 and 3.5 into the equation, 3 into 3.5 is 10.5, so we get 3 plus 10.5, which is 13.5, minus 27 divided by 2, which is also 13.5. Once you simplify, it becomes 0 equal to 0. This means the hyperplane passes through the midpoint of these two data points. So this is what the hyperplane for this data set looks like.

In this video, I have discussed how we can apply the SVM classifier to a given data set to find the hyperplane with the maximum margin.
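One point worth checking numerically: a negative alpha 3 does not satisfy the condition that every alpha is at least 0, so the stationary point found by differentiation is not a feasible solution of the constrained problem. The sketch below first confirms the midpoint check from the talk, then tries a common fix-up that is my assumption and not part of the video: hold alpha 3 at 0 and maximize again, which leaves alpha 1 = alpha 2 = a and phi(a) = 2a - (13/2)a^2, maximized at a = 2/13.

```python
from fractions import Fraction as F

X = [(2, 2), (4, 5), (7, 4)]
y = [-1, 1, 1]


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def margins(w, b):
    """y_i * (w . x_i - b) for every training point (each must be >= 1)."""
    return [yi * (dot(w, x) - b) for x, yi in zip(X, y)]


# Hyperplane from the talk: (2/11) x1 + (6/11) x2 - 27/11 = 0.
w_talk, b_talk = (F(2, 11), F(6, 11)), F(27, 11)
midpoint = (F(2 + 4, 2), F(2 + 5, 2))            # midpoint of x1 and x2
on_plane = dot(w_talk, midpoint) - b_talk == 0   # the plane passes through it

# Assumed fix-up (not in the talk): alpha 3 = 0, alpha 1 = alpha 2 = 2/13,
# so w = a * (x2 - x1).
a = F(2, 13)
w_alt = tuple(a * (q - p) for p, q in zip(X[0], X[1]))
pos_min = min(dot(w_alt, x) for x, yi in zip(X, y) if yi == 1)
neg_max = max(dot(w_alt, x) for x, yi in zip(X, y) if yi == -1)
b_alt = F(1, 2) * (pos_min + neg_max)

# The geometric margin is 2 / ||w||, so a smaller squared norm is a wider margin.
norm2_talk = dot(w_talk, w_talk)
norm2_alt = dot(w_alt, w_alt)

print(on_plane, margins(w_talk, b_talk))
print(w_alt, b_alt, margins(w_alt, b_alt))
print(norm2_talk, norm2_alt)
```

Under this assumption the weight vector comes out as (4/13, 6/13) with bias 33/13, every training point still satisfies y_i f(x_i) >= 1, and the squared norm 4/13 is smaller than the 40/121 of the hyperplane in the talk. That suggests the hyperplane 4 x1 + 6 x2 - 33 = 0 has the wider margin, and it is this one that is perpendicular to the segment joining x1 and x2. Treat this as a check to verify independently rather than as a correction taken from the video.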
I hope the concept of SVM classifier is clear. If you like the video, do like and share with your friends. Press the subscribe button for more videos. Press the bell icon for regular updates. Thank you for watching.

