Transcript for:
Interview Preparation: Support Vector Machines (SVM)

Hello all, my name is Krish and welcome to my YouTube channel. We are on day three of interview preparation, and in this video we are going to discuss the most important interview questions with respect to Support Vector Machines (SVM). With the help of SVM we'll be able to solve both classification and regression problem statements. Before going ahead, what I really want to say is that as part of this interview preparation I'll be asking one or two questions every day. So the first question you need to answer is: which machine learning algorithms are impacted by an imbalanced dataset? That is my interview question for you; if you know the answer, please write it in the comments of this video, and if I see amazing answers I'll definitely reply back.

In this video, again, we are trying to understand how to learn machine learning algorithms for interviews. In our previous videos we have already seen Naive Bayes and linear regression; in this video we will be understanding Support Vector Machines. Support vector machines are an amazing algorithm because there is a lot within them. With the help of kernels (we have the RBF kernel, the linear kernel, the sigmoid kernel) we are able to solve both linear and non-linear classification or regression problem statements, depending on how the data is distributed. Again, I've created these materials for you and given the links you need to refer to. First of all, for the theoretical understanding I have given you two links: Support Vector Machines Part 1 and the maths intuition behind
Support Vector Machines Part 2. Please make sure you cover these two videos; they are the most important, because in them I've discussed what support vectors are, hyperplanes, marginal distance, linearly separable versus non-linearly separable data, soft margin and hard margin, everything that forms the base of support vector machines. And always remember, whenever you are learning a machine learning algorithm that can solve both classification and regression problem statements, you should be able to make the recruiter understand how the classification actually happens and how the regression actually happens. In this particular case you can see that there is a hard margin and a soft margin; just go through those two videos, they are a must, and the links are given in the video description. They cover the theoretical understanding, which is pretty much important, including all the maths.

There is one more video I want to mention, about SVM kernels. Since I did not get time to upload my own, there is an amazing lecture from Andrew Ng on kernels; you can follow that video, where he has explained SVM kernels amazingly. Again, that link is given over here, where I've also discussed the disadvantages. Now coming to the next question: when you say that you've used SVM in a specific project, be it classification or regression, note that it has two variations. SVC is the Support Vector Classifier and SVR is Support Vector Regression. So what are the basic assumptions? That's the first question that arises. There are no such assumptions for SVM, whereas in linear regression we saw the four basic assumptions, right? And again, if you know that answer, comment down below; I actually discussed it in yesterday's video on regression.
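As a quick orientation for the two variations mentioned above, here is a minimal sketch of how they appear in scikit-learn (my own illustration, not code from the video):

```python
# Sketch: the two SVM variations the video mentions, as exposed in scikit-learn.
from sklearn.svm import SVC, SVR

clf = SVC(kernel="rbf")  # support vector classifier, for classification problems
reg = SVR(kernel="rbf")  # support vector regressor, for regression problems
print(type(clf).__name__, type(reg).__name__)  # → SVC SVR
```

Both share the same kernel machinery (linear, poly, RBF, sigmoid); only the prediction target differs.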
Now, what are the advantages of SVM? SVM is more effective in high-dimensional spaces. This is pretty much important for you to understand: whenever you have features with high dimensions, SVM usually works well. Again, the follow-up question may be how and why. If you go through the theoretical explanation of SVM, it focuses on creating support vectors along with hyperplanes. Now, when constructing hyperplanes, if you have many dimensions, SVM applies something called kernels. Kernels are not just about reducing dimensions; suppose I have many points in a plane, then with the help of kernels SVM will effectively project that plane into a higher dimension so that a clear division can happen. Just go and watch the theoretical video where I've explained this. So it usually works well in high dimensions because of these kernels: RBF, poly, linear, sigmoid, all of these effectively help to solve this problem.

Other than that, SVM is relatively memory efficient, which is another advantage. And suppose you really don't know much about the data, you don't have much clarity on what the data is doing, SVM can still play a very important role. The third point is that it works well with structured and semi-structured data like text, images and trees. One thing about SVM: I think you have heard that it is also used with ANN classification problems, where you can use it as the last layer; I have seen that as well. The kernel trick is the real strength of SVM: with an appropriate kernel function we can solve very complex problems.
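To make the kernel-trick advantage concrete, here is a minimal sketch (my own illustration, not from the video) using scikit-learn's `make_circles`, a dataset that no straight line can separate:

```python
# Sketch: the kernel trick on data a straight line cannot separate.
# make_circles gives two concentric rings; a linear kernel fails,
# while the RBF kernel separates them by implicitly mapping the points
# into a higher-dimensional space where a hyperplane works.
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_circles(n_samples=500, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for kernel in ["linear", "rbf"]:
    clf = SVC(kernel=kernel).fit(X_train, y_train)
    # linear typically scores near chance here; rbf near perfect
    print(kernel, round(clf.score(X_test, y_test), 3))
```

This is exactly the "bulging the plane" picture from the theory video: in the original two dimensions the rings are not separable, but the RBF mapping makes them so.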
But again, there are some disadvantages. First, on the positive side: SVM models generalize well in practice, and the risk of overfitting is less in SVM. Apart from that, there are other hyperparameters you need to work with, like degree and gamma, and the C parameter is also important: it is a regularization parameter, and the penalty is a squared L2 penalty.

Now people may be thinking I'm just listing advantages and disadvantages; what do you tell the interviewer if they ask why you used SVM? You can say that in my project the dataset had many high-dimensional features, so I used SVM because we have some amazing kernel functionalities in SVC or SVR. We may definitely get that kind of question. And for whatever advantages I have written over here, like why overfitting is less in SVM, you can search in Google to get more information about the algorithm.

Now coming to the disadvantages: more training time is required for large datasets. This is one of the major disadvantages; it is slow. And I'm not just claiming this; if you use any automated library and train around seven to eight different machine learning algorithms, you'll find that SVM usually takes more time on larger datasets. The second very important point is that it is very difficult to choose a good kernel function: out of RBF, poly and linear, which one do we need to apply? It sometimes becomes difficult, and even with hyperparameter tuning you may not quickly find a kernel that performs
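Since choosing the kernel and tuning C and gamma by hand is hard, as noted above, a common approach is cross-validated grid search. A minimal sketch with scikit-learn (the dataset and grid values are my own illustration):

```python
# Sketch: searching over kernel, C and gamma with cross-validation,
# since picking them by hand is difficult.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
param_grid = {
    "kernel": ["linear", "rbf", "poly"],
    "C": [0.1, 1, 10],           # regularization strength (squared L2 penalty)
    "gamma": ["scale", 0.1, 1],  # kernel coefficient for rbf/poly/sigmoid
}
search = GridSearchCV(SVC(), param_grid, cv=5).fit(X, y)
print(search.best_params_)              # best kernel/C/gamma combination found
print(round(search.best_score_, 3))     # mean cross-validated accuracy
```

Note that this also illustrates the training-time disadvantage: the grid multiplies out to dozens of fits, which is why SVM tuning gets expensive on large datasets.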
well. Again, if you really want to know about kernels, this is the information regarding them, and these are all very important things. As I keep saying, if I really ask a question like "why is more training time required for larger datasets?", that is something you need to explore yourself, because only then will you get a deeper understanding. With respect to C also, this parameter is difficult to tune: finding the exact value of C through hyperparameter tuning is very hard, and that is what I've written in the disadvantages.

Next, is feature scaling required or not? Definitely feature scaling is required, because as you know we create a hard margin or a soft margin, which are nothing but these lines, and those margins are computed from distances between points; that is what happens in support vector machines, so the features must be on comparable scales. Now, see in this diagram: there are two classes of points that we cannot divide using a simple straight line, so for this we apply kernels. What the kernels do is project the points into a higher dimension so that they can be divided by a hyperplane; that is what a kernel does. In this particular case you have two dimensions; this may get converted into three dimensions, and with the help of a hyperplane it will be divided. Something like that is how the kernel functionality works. And again, just give me some more time; I will be uploading videos on kernels also. Many people are requesting that, but due to time constraints I have not been able to do it yet. I surely will. Now, the impact
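The feature-scaling point above can be demonstrated in a few lines. A sketch (my own illustration) on the wine dataset, whose features live on very different scales:

```python
# Sketch: feature scaling matters because SVM margins are built from
# distances; unscaled features on large ranges dominate the distance.
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)  # features range from ~0.1 to ~1500

raw = cross_val_score(SVC(), X, y, cv=5).mean()
scaled = cross_val_score(make_pipeline(StandardScaler(), SVC()), X, y, cv=5).mean()
print(round(raw, 3), round(scaled, 3))  # scaled accuracy is typically far higher
```

Putting the scaler inside a `Pipeline` also ensures scaling statistics are learned only from the training folds, which is the idiomatic way to do it.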
of missing values: although SVMs are an attractive option when constructing a classifier, they do not easily accommodate missing covariate information, similar to many other prediction and classification methods. So definitely, SVM is sensitive to missing values. Regarding the impact of outliers, it is again sensitive, and I'm not just asserting this; I've also given a research paper you can have a look at. The paper says that despite its popularity, SVM has a serious drawback: it is sensitive to outliers in the training samples. This is pretty much important to understand. SVM uses a convex loss, the hinge loss, and the unboundedness of this convex loss causes the sensitivity to outliers.

Now you may be thinking, "Krish, where is all this coming from?" I've simply explored it, and this is the proper way of learning any algorithm: whatever the important points are, I'm collecting them here. You just have to prepare yourself; I've given you the materials. Here is your theoretical video, here is your research paper, and if you have solved any use case with the help of SVM, just follow this structure. The type of problems it can solve: both classification and regression, with the two variations I mentioned, SVC and SVR. So try to learn that as well.

Regarding overfitting and underfitting: in SVM you can create a soft margin instead of a hard one. A soft margin basically means that some points are intentionally allowed to enter the margin. If you see over here, with a hard margin none of the points can be inside this region, while with a soft margin some points are allowed inside. That is the basic difference between hard margin and soft margin. And why is it done? To
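The soft-margin behaviour described above is controlled by the C parameter in scikit-learn, and one visible effect is the number of support vectors. A small sketch (my own illustration) on slightly overlapping blobs:

```python
# Sketch: C controls how soft the margin is. Small C = softer margin
# (more points allowed inside it, hence more support vectors, which
# helps against overfitting); very large C approximates a hard margin.
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.0, random_state=0)

for C in [0.01, 1, 100]:
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print(C, clf.n_support_.sum())  # support-vector count shrinks as C grows
```

So "using the C parameter to penalize" and "allowing points into the margin" are two views of the same regularization knob.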
actually reduce the overfitting problem. To avoid overfitting they use the soft margin, and apart from that they also use the C parameter to penalize points that violate the margin.

Different problem statements you can solve: as I've written, we can use SVM with ANN use cases, where, if it is a classification problem, I can cut off the last layer and use SVM there. You also have applications like intrusion detection and handwriting recognition; if you want more examples, I've given them over here for SVC. One more thing: just go and read each and every parameter; at least you'll get some idea. And if you really want to implement some of this, here it is, you can implement it, but remember the important things you need to worry about: it is sensitive to outliers, it is sensitive to missing values, and yes, you have to do feature scaling. If they ask why it works well with a high dimension of features, you have to say that because of the kernel trick present in SVC and SVR we are able to solve that problem efficiently.

And here you can see some more examples you can go through: see "Recognizing handwritten digits". Here is the code; you can execute it all together in one go, and I think it is pretty simple. You can see they have used SVC with gamma set to 0.001. Similarly, in the case of SVR, have a look at some of the examples given over here, such as "Comparison of kernel ridge regression and SVR (support vector regression) using linear and non-linear kernels". I think you'll be able to do it very nicely. Now see in this diagram: this is my
linear model. Now if I create an RBF model, I'm able to solve this particular problem properly, and if I use a polynomial model the problem statement is still solved very easily. The whole code is over there; you can download it and try it yourself.

Again, after you've gone through all of this, because I've mentioned every possible question: honestly, in virtual interviews no one has asked me about SVM; they always go with ensemble techniques like random forest. But yes, SVM is pretty good, and if you understand the theoretical videos, where I've discussed all the maths, you'll be able to understand it very nicely. I hope you like this particular video. Please do subscribe to the channel and share it with all your friends. I'll see you in the next video. Have a great day. Thank you, bye bye.
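For reference, the SVR kernel comparison discussed just before the sign-off can be sketched like this (my own illustration on a noisy sine wave, not the exact documentation code):

```python
# Sketch: SVR with linear, RBF and polynomial kernels on a noisy sine wave,
# mirroring the linear vs non-linear kernel comparison from the video.
import numpy as np
from sklearn.svm import SVR

rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(80, 1), axis=0)       # inputs in [0, 5)
y = np.sin(X).ravel() + 0.1 * rng.randn(80)    # noisy sine target

for kernel in ["linear", "rbf", "poly"]:
    model = SVR(kernel=kernel, C=100, gamma="scale").fit(X, y)
    print(kernel, round(model.score(X, y), 3))  # R^2 on the training data
```

The linear kernel cannot follow the curve, while the RBF (and, to a lesser degree, polynomial) kernel fits it well, which is exactly the point of the diagram.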