Transcript for:
Machine Learning Algorithms Overview

Namaskar, good afternoon everyone. Today we will be doing the second session of this course; the title of the course is AI and ML for Geodata Analysis. In today's session we will be talking about machine learning algorithms: what machine learning is and what machine learning algorithms are.

Prior to starting with the technical content of this session, I would again announce that, since we have a large number of participants in this course, I request all participants to watch the sessions through YouTube. Participants who have registered as individual participants can log into the ISRO LMS after the end of the session and appear in the quiz; based on the daily quiz they will get the certificate. Participants who have registered via a nodal coordinator are also requested, if they are not able to log in to E-class, to watch these sessions through YouTube; for their attendance, they can ask their coordinators to mark it in the LMS. So I request cooperation from all the participants so that you all can have a very fruitful learning experience in these five days.

Coming back to the topic: today we will be talking about machine learning algorithms, and I'm Dr Punam sayari. We will start our session with the words of Professor GTH James. He was a professor of data science and operations in a big college, and he said that for the 21st century data is the sword, and whoever is able to handle that data properly will be called a samurai. So we are in an era where data is very, very important, and to handle a large amount of data we require advanced algorithms. Machine learning and deep learning are one such set of algorithms, which we are applying in all spheres of life, since we have a large amount of data and
these types of algorithms work only when you have a large amount of data.

So now let us talk about what machine learning is. Yesterday Madam has already discussed what artificial intelligence is, and she must also have told you that machine learning is a subset of artificial intelligence. Artificial intelligence is any function in which the machine behaves like a human. Machine learning is a subset of artificial intelligence in which the computer is trained to automate a particular task. In machine learning, data is fed into the algorithm; the algorithm tries to understand the data set and the relationship between the input and the output data that we have provided, it extracts the features, and it learns on its own. When the machine has finished learning, it can predict the class of a new data point, or do the classification, or do any of the operations that we are talking about.

Deep learning is a further subset of machine learning which uses deep neural networks. Tomorrow in the session we will be talking about what a neural network is and how these neural networks are used for doing the processing. In deep learning the machine uses a large number of neural network layers, and progressively these layers go on learning from the data set. How deep the model is tells me how many layers there are. If the model is deep, we assume that it will be a better model, but that is not always true.

What I have listed here are the differences between machine learning and deep learning algorithms. Talking about the factors based on which we can differentiate between machine learning and deep learning, the first one is the problem-solving approach: how it is attempting the
problem. In machine learning, the features which will be used for doing the task, classification in our case, are manually extracted. Our course is titled machine learning and deep learning for geodata analysis, and in geodata analysis we are generally doing classification and clustering of the data set. In deep learning there is minimal human intervention, and we don't know what features it will be using.

Coming to the training methods: in machine learning we have a large number of training methods. We have supervised, unsupervised, reinforcement, semi-supervised, and self-learning; so many types of learning are there. In deep learning we have autoencoders, GANs (generative adversarial networks), convolutional neural networks (CNN), and RNN. These architectures are there, and they learn on their own.

Then, what is the complexity of the algorithm? In machine learning we have diverse algorithms. You will see today that we start with very basic algorithms, and there are advanced algorithms such as the random forest classifier and SVM, which is support vector machines, so the complexity of the algorithm varies. In deep learning the architecture is basically made up of interconnected neurons, and the architecture is somewhat complex.

Now, how does it interpret the data? In machine learning the data should always be well structured, whereas deep learning, since it is based on artificial neural networks, takes the whole data set and finds the features itself, so a large amount of data is required. And what would be the infrastructure and data
requirement in case of both machine learning and deep learning? Machine learning data and models can be deployed and executed on a single instance or distributed across a server cluster, whereas deep learning models, because they are quite complex and rely on large data sets, require a large amount of storage and also computational power. So these were the key differences between machine learning and deep learning. In today's session we will be focusing mainly on machine learning.

So what is machine learning? Machine learning is a method of data analysis that automates analytical model building. In traditional programming, we were giving the program to the computer. The program was written by the analyst, by the programmer, and when you provide the data along with that program, it gives you an output. Suppose you want the sum of two numbers: 5 + 3 is equal to 8. You have to give the program value X plus value Y, so you have to apply that operator yourself, and it gives you the output. Now whenever you feed any other values for X and Y, suppose value X as four and value Y as five, the output that you get will be nine.

Now what happens in machine learning? In machine learning we are not giving the program to the computer; we are giving the input and the output. Here I have given you a very simple example. We are giving the input and the output, that is: this is the question, this should be the answer; this is the question, this should be the answer. For every data point we are saying what the output should be, and this type of data set is called training data. When we are training the computer, we are providing all this information to the computer. Now the computer itself tries to find out what is the connection between
the input and the output, or what is the relationship between the input and the output, and based on that it develops the program, or the logic, by itself. So in machine learning, instead of writing the program by hand, we collect lots of examples that specify the correct output for a given input. The machine learning algorithm takes these examples and produces a program that does the job. Machine learning is programming computers to optimize a performance criterion using example data or past experience.

Now, there can be different types of machine learning algorithms: supervised, unsupervised, semi-supervised, and reinforcement learning. A few of them we will try to discuss here.

What is supervised machine learning? Supervised means you are supervising the system: you are making it learn using labeled data, that is, you are telling it the input and the output. Such algorithms are called supervised algorithms. For an unsupervised algorithm, you are not giving any labeled training data to the system; instead you are asking the system to discover the patterns in the unlabeled data, so it tries by itself to cluster the similar data points. Reinforcement learning is the one in which the algorithm predicts something, that is, it does some job. If the job is done correctly, a reward or positive feedback is given to the system, so it knows the result is good. If the feedback is negative, that is, if you are saying that the result is not good, then it will improve itself accordingly.

So these are basically the types of machine learning: supervised, unsupervised, and reinforcement. In geodata analysis we are generally using supervised and unsupervised algorithms. In supervised learning we can have classification algorithms and regression
algorithms; in unsupervised learning we can have clustering algorithms and association analysis. So many types of algorithms are there, and one by one we will be talking about them.

Now, what problems do we have when using machine learning? What do you want to do? Nowadays you must have seen that ML and DL are being used in all sorts of programs, but in our case we are only talking about geodata analysis, and in geodata analysis we are basically dealing with three major problems. Here I have listed all the problems that can be addressed using machine learning; it can be a time series problem, feature extraction, or anomaly detection as well. In our case we are talking mainly about classification, clustering, and regression.

What is classification? Classification is a problem where the answer to be learned is one of finitely many possible values. Suppose you have been given a complete image and asked to find out the land use pattern. In the image there are many pixels, and for every pixel the algorithm has to decide whether this pixel is a vegetation pixel, a water pixel, or an urban pixel. Such problems, in which only a finite number of values are possible, are called classification problems, and this is a supervised learning problem.

Then we can have problems in which the output need not be one of a finite number of values; the output can be a continuous value, that is, you can get 2.0, 2.01, 2.02, and so on as output values. Such problems are called regression problems. Numerical weather modeling and all such numerical modeling come under regression analysis, and this again is a supervised learning problem.

Clustering, on the other hand, is an unsupervised learning problem where the structure to be learned is a set of clusters. So instead
of classifying that this pixel belongs to water or agriculture or urban, the system makes clusters of similar pixels. Similarity can be based on any parameter; those parameters can be talked about later on. Based on some similarity criterion, the algorithm creates groups of similar pixels, and those groups of similar pixels are called clusters. So these three are the basic machine learning problems in geodata analysis.

Coming to supervised learning: what is supervised learning? To explain it, I have given an example here. Here you have some data in which there are tomatoes, carrots, and bell peppers. We give some labels: for the picture of a carrot we name it carrot, for the picture of a tomato we name it tomato, and for the picture of a bell pepper we name it bell pepper. We give these labeled pictures as input to the model, and the model gets trained. Once the model is trained, we can test it with the testing data and see whether the model is doing the prediction correctly or not. Once I'm satisfied with my model, I can give it the raw data, and on this data it will do the prediction and say how many carrots, bell peppers, and tomatoes there are. So here we are giving training to our model, and that is why such algorithms are called supervised learning. Supervised learning algorithms are trained using labeled examples, that is, inputs for which the desired output is known.

So we have training, validation, and finally prediction. In training, we give the training data and make the algorithm learn; we create the model. In validation, we give the trained model the test data and see how accurately it is doing the prediction. So once my model is validated, that
okay, this is a good enough model, then I can use the same model on a new data set and do the prediction. This is the overall flow for supervised learning.

How is it done in geodata analysis? Suppose this is my image and I want to do a land use land cover classification. Looking at the image, we have understood that there are majorly these six classes: agriculture, dry river bed, reservoir, forest, urban area, and open forest. What we will do is try to identify pixels which are true representatives of agriculture. Suppose, for this yellow AOI (area of interest) which I have marked, we are sure that all the pixels inside it represent agriculture. So I can select these pixels and mark them as agriculture. Now my algorithm will do some statistical calculations: it will find the mean and standard deviation of this group of pixels, and this will be called the training set or training data for agriculture. In the same way we can give training data for dry river bed, water, forest, urban, and so on. Once all this is given, then based on this data set every pixel, starting from the first pixel of the image, will be labeled and classified into one of these six classes. This is how supervised classification is done for geodata.

Now what is the disadvantage? The disadvantage is that it is limited to learning from labeled data sets, which are often expensive, so it is sometimes difficult to get the training data. And if your training data is wrong, your classification results will be wrong. If the training data is too little or the training is not done properly, then the performance may be poor.

These are some of the supervised learning algorithms that you must have read about or heard about
them in different literature. The learning algorithms can be classified into five broad classes: numerical classifiers, parametric functions, non-parametric functions, symbolic learning, and ensemble learning. We will not go into the details of these, but we will definitely look at a few of these algorithms.

Before going into the algorithms, we should also understand how the training is done. Whatever data we have collected, we split it into three parts: training data, validation data, and testing data. Training data is used for training the model, validation data is used to determine what adjustments have to be made to the model hyperparameters, and testing data is used to get the final performance of the model. Suppose we have collected 100 pixels; then generally we keep 60 pixels for training, 20 pixels for validation, and 20 pixels for testing. This 60/20/20 split is what we generally use; sometimes we use 70/30 also, that is, 70 for training and 30 for validation and testing, 15 each. These are some of the splits mentioned in the literature, and we can select any of them.

Now the first machine learning algorithm that we will be talking about is the parallelepiped classification algorithm. It's a very simple algorithm. As I have told you, once you provide the training samples, the system generates the mean and standard deviation. Suppose we have two-band data; I assume that you are aware of what the different bands are. We can make a plot of band one against band two; just to explain it to you, we have taken only two-band data. You plot the mean of each class: the mean in band one and the mean in band two for one class, the same for class three, and for class two also. So we have plotted
the means, and we can plot the standard deviation also. Mean plus one standard deviation would be the maximum limit, and mean minus one standard deviation would be the minimum limit. We have plotted the maximum and minimum limits in both the bands, and when we plot these we find that we have got a sort of box. So boxes for all the classes are created based on the mean and standard deviation.

Now the parallelepiped algorithm is very simple: any unknown pixel which falls in any of these boxes will be classified as that particular class. Suppose here we have five classes, one, two, three, four, and five. You can see that pixel A is falling in the box for class number four, that is, forest, so pixel A will be classified as forest. Pixel B is not falling in any of the boxes, so pixel B will be left as an unclassified pixel if we are using the parallelepiped classification algorithm.

What are the advantages? It is very fast and very simple, and it gives a broad classification, thus narrowing down the number of possible classes to which any pixel may be assigned before further time-consuming calculations are made. When you are plotting your training samples, if you find that all the training samples are like this, that is, in any one of the band combinations they are separated, as here, where you can see that yellow, green, cyan, red, and blue are all separate, then you can go for the parallelepiped algorithm.

The next algorithm is minimum distance to means classification. This is also very simple. Here we only have to calculate the mean of all the training samples. These are the training samples, you can see, and we have calculated the mean of each training sample. Now you take any unknown pixel. Suppose, in the same example, I have taken the unknown pixel as A. Calculate
the spectral distance of A from all the cluster means, that is, from each of these training sample means. Whichever distance is minimum, the pixel will be assigned to that particular class. So here again pixel A will be assigned to class forest, whereas pixel B, instead of being left unclassified, will now be assigned to class three, that is, vitman. So this is minimum distance to means: whichever distance between the pixel and a class mean is minimum, the pixel will be put into that class.

The advantage is that, since every pixel is spectrally closer to one of the sample means than to the others, there won't be any unclassified pixels, and it is also very simple and very fast. The disadvantage is that pixels which should have been left unclassified also become classified, because it is not taking class variability into account: the standard deviation is not considered during classification. So this is a case where minimum distance should be taken.

The third one is the Mahalanobis decision rule. The Mahalanobis decision rule is similar to minimum distance, but as you can see here, there we did not use the standard deviation or variance, whereas here we are using the covariance matrix in the equation as well, so that problem has been sorted out. The advantage is that it takes the variability of the classes into account, unlike minimum distance or parallelepiped, and it may be more useful than minimum distance in cases where statistical criteria must be taken into account. But since it is using the covariance matrix directly, it tends to overclassify signatures with relatively large values in the covariance matrix. The equation is slightly bigger, so it is slower to compute than parallelepiped or minimum distance. Mahalanobis distance is a parametric algorithm. Here the sample value clouds overlap and, as you can see, the cloud is having a complex
shape, so in such cases the Mahalanobis decision rule should be used.

The next one is the maximum likelihood or Bayes decision rule. Maximum likelihood classification is based on the probability that a pixel belongs to a particular class. The basic equation assumes that these probabilities are equal for all the classes and that the input bands have a normal distribution. If we have some prior knowledge that the probabilities are not equal for all the classes, then we can specify weights for the different classes, and when we add the weights, the maximum likelihood decision rule is also known as the Bayes decision rule. The Gaussian or maximum likelihood classifier assumes that the feature vectors of each class are statistically distributed according to a multivariate normal probability density function.

You can see here a case where the training clouds are overlapping and all of them are elongated in shape. The elongated shape tells us that they have a large amount of correlation, and this indicates that we need to classify the image using the maximum likelihood algorithm. The classification result looks something like this. The advantage is that, out of the many pixel-based classifiers, it is one of the most accurate, because it takes the most variables into consideration. The disadvantage is that it is an extensive equation and takes a long time to compute, and it again tends to overclassify signatures with relatively large values in the covariance matrix.

Now, out of the algorithms that we have discussed, how should we select which algorithm to use? We should go on answering these questions. How many classes are there? If we only want to classify our image into two classes, then we should go for binary encoding. If there are more classes, we should see whether the sample values overlap. Here you can see blue and brown, they
are not overlapping, so in such cases the parallelepiped algorithm would be the best. If the values are overlapping, then what is the shape of the cloud? If it is a complex shape, we should go for the Mahalanobis decision rule. If the shape is simple, then we have to see whether there is a correlation in the sample brightness values; as I told you, the elongated shape shows correlation. If there is correlation, then use maximum likelihood, so in this case, for red and blue, we can go for maximum likelihood. If there is no correlation, we can go for the minimum distance to means algorithm.

Now coming to some advanced algorithms used in machine learning. The next one that we will be discussing is the decision tree. A decision tree is something that we use every day. It is a type of multi-stage classifier that can be applied to a single image or a stack of images. A decision tree is made up of a series of binary decisions that are used to determine the correct category for each pixel. These decisions can be based on any available characteristic of the data set, and no single decision in the tree performs the complete classification or segmentation.

A decision tree is very simple to understand. Suppose I want to go out to play football or lawn tennis. First of all, we will go out and see how the weather is. If it is raining, that means no. If it is overcast, we will see. If it is sunny, then we will check how much the humidity is: if the humidity is high you cannot go out to play, and if the humidity is normal you can go out to play. So here you can see that each internal node tests a particular attribute, each branch corresponds to an attribute value, and each leaf node assigns a classification. So classification is not done by only one yes or no; each decision divides the data into one of
two possible classes or groups of classes, and this yes and no, yes and no goes on until we are able to decide into which class this particular pixel is to be put.

Suppose here we are using a hierarchical decision tree classifier, and these are the rule sets. If the aspect is between 300 and 359, you go to rule A. If the aspect condition is true, you have to see whether the habitat conditions are met, and then my hypothesis would be proved. Here you are checking whether the value is greater than or equal to 44, less than or equal to 52, 31, so all these rule sets are there, and the NDVI is within 2.2 to 2.7, and so on; so many rules are there. Only if everything comes out as yes will it say that this is white fir or something, that is, a particular species.

Here again you can see: in band four, check whether the value is less than or equal to 55; then check band two, and if the value is less than or equal to 30, it is agriculture. If it is greater than 30, check the band one value: if it is less than or equal to 81, it is less dense built-up; if it is greater than 81, then check band two: if it is less than or equal to 46, it is shallow water, and if it is greater than 46, it is dense built-up. Like this you go on making different nodes and taking the decisions.

The advantage of a decision tree is that it is easy to understand and interpret, and it is perfect for visual representation, because we as humans also make our decisions in this manner. It is an example of a white box model; white box means everything is in front of us, and it closely mimics the human decision-making process. It can work with both numerical and categorical features, so both types of data can be handled. It requires little data pre-processing: no dummy variables, no extra calculations are needed. It is fast for inference, and it doesn't make any assumptions about
the shape of the data set. Feature selection happens automatically: unimportant features will not influence the result, and the presence of features that depend on each other also doesn't affect the quality. The disadvantage of a decision tree is that it tends to overfit: it creates a model which gives very good results on the training data, and if we give it another data set, it gives not so good results.

Another algorithm is the classification and regression tree, abbreviated as CART. CART is a decision tree algorithm only, but it can be used for classification as well as regression problems. This algorithm is available in Google Earth Engine, and there will be a demonstration on Google Earth Engine.

As I have told you, decision trees tend to overfit, that is, they give very good results on the training data but not so good results on other data. To deal with such problems, we can use another type of algorithm, called the random forest classifier. What is there in a forest? In a forest there are a large number of trees, and we are using the same logic here. In the previous slides we talked about one single decision tree, so why should we use only one tree? We can build multiple decision trees, and that is what happens in random forest classification. In this type of algorithm multiple decision trees are created, and when we are creating these trees we also give them different training data: from the full set of training data, we randomly choose a subset and provide it to each individual tree. Now the individual trees are asked to
do the task: they perform the classification and they give us the results. The results of all the trees are checked, and the final result is decided by the majority, because each tree gives a vote. Suppose, as you can see here, tree one says that it is a cat, tree two says it is a dog, tree three again says it is a cat, and like this we can have any number of trees. Whichever class the maximum number of trees vote for, the pixel is put into that class. So votes are tallied to reach the final prediction.

This concept is the concept of ensemble learning. Ensemble learning is the process of combining multiple classifiers to solve a complex problem or to improve the performance of the model. Here we are making a large number of trees, so we are using multiple classifiers, and we are using multiple classifiers so that we can improve the performance of the model. We use a large number of trees so that we can solve the problem of overfitting that we can encounter when we use only a single decision tree, and so that higher accuracy can be achieved. So this is the random forest classifier.

When we are talking about ensembles, there are two types of methods by which classifiers can be assembled to give better results: one is bagging and the other is boosting. Bagging is parallel: the different classifiers each predict results, and based on those the final prediction is made. Boosting is another type of ensemble method, in which the results of one classifier are sent to the next classifier to improve on them, then these results are sent to the third classifier, and so on, so that finally the results are improved and the predictions are made. But anyhow, we will not go further into bagging and boosting here, although random forest is itself a bagging-type method; we were talking only about the
random forest classifier. Now, in the random forest classifier, as I already said, while growing the trees we are adding randomness into our model. Instead of searching for the most important feature while splitting a node, it searches for the best feature among a random subset of features. What is happening is this: this is my training set, and within the training set we are splitting the data into training data one, two, three, four, five, up to N sets of training data. The entire training set is randomly split; there can be some overlap between the subsets, and they can also differ from one another. From training data one, decision tree 1 would be trained, from training data two, decision tree 2 would be trained, and so on, and they will make their predictions. Finally the majority voting is done and the final prediction is made. The same thing is depicted here also, in another image. This is a very nice, very interesting concept, so I have put a few slides on it.

Now, what are the properties of random forest? The first one is diversity. Because we are taking different training data, not all the attributes, variables or features are taken into account when we are creating one individual tree. Every tree is unique in its nature because it is trained on a different set of training data. Because each tree does not evaluate all the characteristics, the feature space is decreased: each tree is looking at only one set of training data. So the curse of dimensionality can also be handled. The curse of dimensionality means that when you have a large number of features (dimensions), that large amount of data is not always useful for you; many times it gives you a lot of problems as well. It also works in parallel: each tree is built
individually from various data and properties. This means that we can fully utilize the CPU to construct a random forest, so the computational resources can be parallelized. Here we need not separate the data for training and testing, because each decision tree does not use the entire training data: some data is not used by one tree and is used by some other tree. Stability comes because the final outcome is based on majority voting or averaging.

So these are the advantages of random forest. It is capable of performing both classification and regression. It is capable of handling large data sets with high dimensionality. It enhances the accuracy of the model and prevents the overfitting issue. Even though we can use random forest for regression analysis, it is less suitable for regression than for classification. So why should we go for random forest? It takes less training time compared to other algorithms, because each tree uses only a part of the training data. It predicts output with high accuracy, and even for a large data set it runs efficiently. It can also maintain accuracy when a large proportion of data is missing.

The next algorithm that we will talk about is the support vector machine, SVM. The support vector machine is again a supervised machine learning algorithm, which is used for both classification and regression tasks. Even though we are saying that it can be used for regression, it is best suited to classification. Now I go to the next slide. When we are talking about hyperspectral data, a large number of bands are there, which means large sets of data: one set of data is one band, another set of data is another band, and so on. For such hyperspectral data the support vector machine tries to find a hyperplane: it chooses a hyperplane which
maximizes the margin between the classes. Here you can see that blue is one class and red is another class. You can have multiple hyperplanes. Why are we using "hyper"? Because so many dimensions are there; that is why we use the term hyperplane. One of the hyperplanes would be such that it has the maximum distance from both classes, so SVM tries to choose the hyperplane that maximizes the margin between the classes. The points which are closest to this hyperplane are supporting it, so they are called support vectors.

So the main objective of the SVM algorithm is to find the optimal hyperplane in an N-dimensional space that can separate the data points of different classes in the feature space. The hyperplane is chosen so that the margin between the closest points of different classes is as large as possible. The dimension of the hyperplane depends on the number of features. If the number of input features is two, then the hyperplane will be just a line. If the number of input features is three, then the hyperplane becomes a two-dimensional plane. Greater than that, we can only imagine.

Okay, now what are the advantages and disadvantages? It works really well when there is a clear margin of separation. It is effective in high-dimensional spaces, so it is good in cases where a large number of bands are there. It is effective in cases where the number of dimensions is greater than the number of samples. It uses a subset of the training points (the support vectors) in the decision function, so it is also memory efficient. If the data set is large, the training time will also be very high, so that is one disadvantage, and when the data set has a lot of noise, SVM will not perform very well.

The next one is artificial immune networks. It is inspired by the natural immune system of the human body, and it is applied for solving complex computational problems in classification, pattern recognition and optimization. It is based on the principle
of the behavior of both B cells and T cells in the biological immune system.

Okay, so then we have logistic regression. We have talked about the classification problem; regression is a statistical method for analyzing a data set in which there are one or more independent variables that determine an outcome. Here the outcome is measured with a dichotomous variable. The goal of logistic regression is to find the best-fitting model to describe the relationship between the dichotomous characteristic of interest and a set of independent variables. So logistic regression is a method of binary classification: it can do only one kind of classification, it can discriminate between zero and one. It does not require too many computational resources, so in cases where we have to classify our data into only two classes, yes or no, we can go for logistic regression.

The second type of classification algorithm is unsupervised learning. As the name itself suggests, we are not giving training to our models; instead we are asking the computer to find the natural patterns in the data set. I would like to explain this with the help of this figure. If you look at it carefully, you will see that the pixels having values 12, 14 and 11 are similar; I will not say they are the same, they are similar. Similarly, the pixels with values 178, 177, 181, 180 and 183 are also similar. So we can assign them to cluster A, cluster B, cluster C, and so on. This is what is called unsupervised clustering: an unsupervised learning method in which unlabelled data points are grouped into clusters that share similar properties. The two main unsupervised learning tasks are clustering and dimensionality reduction. So these are the unsupervised learning algorithms: clustering algorithms such as K-means clustering and spectral clustering, and then we have dimensionality reduction, such as PCA (principal component analysis),
factor analysis, and so on.

Okay, so what is the advantage of unsupervised learning? Clustering automatically splits the data set into groups based on their similarities. You can use it for anomaly detection also: anomaly detection can discover unusual data points, which is useful for finding fraudulent transactions. Association mining identifies sets of items which often occur together in the data set.

What are the disadvantages? We cannot get precise information regarding data sorting and the output, as the data used in unsupervised learning is unlabelled and not known. The results have less accuracy because the input data is not known and labelled by people in advance. The spectral classes do not always correspond to informational classes: for example an agriculture field, an agriculture field with a standing crop, and a field where the crop has just been sown will all have different spectral values. The user needs to spend time interpreting and labelling the classes that follow from the classification.

One algorithm of unsupervised classification is K-means clustering. It assigns data points to one of K clusters depending on their distance from the center of the cluster, that is, how far spectrally the particular pixel is from the cluster center. It starts by randomly placing the cluster centroids in the space. Then each data point is assigned to one of the clusters based on its distance from the centroid of that cluster. This process runs iteratively until it finds good clusters.

Okay, so now these are the differences between supervised and unsupervised learning. What is the objective of supervised learning? It is to approximate a function that maps inputs to outputs based on the training data. The objective of unsupervised learning is to build a concise representation of the data and generate imaginative content from it. Supervised learning may be highly accurate and reliable; unsupervised learning would be less accurate, but it is reliable because it takes
natural clusters into account. Supervised learning is a simpler method, whereas unsupervised learning is computationally complex. In supervised learning the number of classes is known, and you can classify into 10, 15 or 20 classes, whereas in unsupervised learning you can have an unknown number of clusters. In supervised learning there is a desired output value, whereas in unsupervised learning there is no corresponding output value.

Then there are some methods of semi-supervised learning. Semi-supervised means that we have labelled points, we have training data, but we have very few labelled data. Based on these few labelled data you can do the initial classifier optimization, then classify the unlabelled data with the trained classifier to label them, and finally retrain the classifier. So by using semi-supervised learning it is possible to combine the advantages of working with a small labelled data set to guide the learning process and a larger unlabelled data set to increase the generalizability of the solution.

Then there is reinforcement learning, in which the software is trained to make decisions to achieve the most optimal result. It mimics the trial-and-error learning process that humans use to achieve their goals. Software actions that work towards the goal are reinforced, while actions that detract from the goal are ignored. It follows the concept of the hit-and-trial method. The agent is rewarded or penalized with a point for a correct or a wrong answer, and on the basis of the positive reward points gained the model trains itself. Once trained, it is ready to predict on new data presented to it. Reinforcement learning algorithms use a reward-and-punishment paradigm as they process the data: they learn from the feedback of each action and self-discover the best processing paths to achieve the final outcomes.

What are the benefits of reinforcement learning? In complex environments it gives us very good accuracy, so it can be used in
complex environments with many rules, tools and dependencies. It requires less human interaction: in traditional machine learning algorithms humans must label data pairs to direct the algorithm, whereas a reinforcement learning algorithm learns by itself. It also optimizes for long-term goals.

So this is all for the session on machine learning. Thank you for joining the session. We will take a small break and then we will reassemble for the question and answer session. Thank you.
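
Supplementary note (not part of the spoken session): two ideas from the session can be sketched in a few lines of Python. The first is the majority vote that a random forest uses to combine its trees. The second is the K-means loop of assigning points to the nearest centroid and then moving the centroids. This is only a minimal illustration under stated assumptions. The function names `majority_vote` and `kmeans_1d` and the parameters `seed` and `max_iter` are hypothetical, chosen for this sketch, and single pixel values are clustered instead of full multi-band spectra.

```python
import random
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    # most_common(1) gives [(label, count)] for the most frequent label.
    return Counter(tree_predictions).most_common(1)[0][0]


def kmeans_1d(values, k, seed=0, max_iter=100):
    """Cluster scalar pixel values into k groups (illustrative, not optimized)."""
    rng = random.Random(seed)
    # Start from k distinct data values chosen at random as the centroids.
    centroids = rng.sample(sorted(set(values)), k)
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each value joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # centroids stopped moving, so assignments are stable
        centroids = new_centroids
    return centroids, clusters


# Votes from the three-tree illustration in the session.
print(majority_vote(["cat", "dog", "cat"]))  # cat

# Pixel values quoted in the unsupervised-learning example.
pixels = [12, 14, 11, 178, 177, 181, 180, 183]
centroids, clusters = kmeans_1d(pixels, k=2)
print(sorted(sorted(c) for c in clusters))
# [[11, 12, 14], [177, 178, 180, 181, 183]]
```

With real imagery each pixel would be a vector of band values and the distance would be computed across all bands, but the assign-then-update loop stays the same.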