Hello everyone, welcome back to my YouTube channel troublefree. In this video I am going to explain the major issues in data mining, in the subject of data mining. I am going to speak very slowly in this video, so if that is too slow for you, I suggest you listen at 1.5x or 1.75x. I have to make a lot of videos, and if I speak fast I get tired by the first or second video, so that's why I'm going to speak slowly.

Now let's get into the topic. What are the major issues that you are going to face in data mining? That is, when you are mining the data, what issues will you face? How to overcome those issues may not be discussed in this video, but you'll get an idea of what the issues are.

First: mining different kinds of knowledge in databases. In a data mining system there will be many users, right? And each user will have their own interests and their own needs. So, based on those needs, you have to mine different kinds of knowledge. That is why it is important for a data mining system to cover the whole range of knowledge available. So for this, what you can write is simple: there are many users in a data mining system, and each user will have different needs.
And to satisfy the different needs of each and every user, the data mining system has to cover the whole range of knowledge types. That's all. First one done.

Next: interactive mining of knowledge at multiple levels of abstraction. What does interactive mean? Suppose, for example, you want the roll numbers of all the students who belong to CSE, whose name starts with S, and whose gender is female. These are your search criteria. You also have another constraint: you need the students for a project, so you want to select up to 10 members. So which filter will you apply first? You will filter for CSE and see how many records you get. Based on that, you apply the next filter: names starting with S. Suppose you now get 20 results, but you want only 10. Then you would apply the female filter. But if you apply all these filters directly at once, you may get only three results, and then 10 is not achieved, right? That is why you have to go level by level. First, when you search for CSE, you should get the results for CSE. When you then search for students in CSE whose name starts with S, you should get those students. Then female, and so on. At every level you see the results and decide how to refine the search. So the mining has to be interactive, at multiple levels of abstraction. Got it? That is the second one.

Next: incorporation of background knowledge. Background knowledge matters not only in data mining but anywhere. If you are starting a new project or a new subject, or whatever it is, you have to look at the background of the subject. Let us take data mining itself as the example: if you want to start the preparation
of data mining for your semester exams, you need some background, right? You will search for PDFs on Google, or for videos on YouTube, or for notes from your faculty. Like that, you collect all the background material before you start your preparation. Data mining is the same: the background knowledge of the domain you are working in should be brought into the current data mining process and used to guide it. Got it? Done.

Next: presentation and visualization of data mining results. Again, an example. Say you did a very nice project, a mini project or a major project. If you fail to present it properly, whether through a PPT, an explanation, documentation, or whatever it is, then however perfectly you did the work, it has gone to waste, right? In the same way, whatever results you have obtained from a data mining task should be presented and visualized in a way that the user, and everybody else, can easily understand.

Next: handling noisy or incomplete data. So what is noisy data, and what is incomplete data? Noisy data is nothing but errors in our data. Incomplete data is nothing but missing values. You will understand this more clearly in the next video, on data pre-processing. Handling noisy or incomplete data is also very important, because when you are doing
the data mining task, it has to be done very efficiently and very accurately, right? Noisy or incomplete data should not create a problem for you. For this we will be using a method called data cleaning. You will understand it in more detail in data pre-processing, don't worry.

Next: efficiency and scalability of data mining algorithms. In data mining you are extracting information from a huge amount of data, right? That is why data mining algorithms should always be efficient and scalable. Scalable in what sense? The algorithm has to work well whether you are extracting from 100 records or from 1,000 records or more: as the number of records grows, the performance should stay acceptable instead of breaking down. That is what scalability means. Efficiency you already know.

So this is all about the major issues in data mining. These headings may seem confusing to remember, but that's okay as long as you remember them and write them in your own words. Got it?

That's all for this video. Thanks for watching till the end. In the next video I'll be explaining data pre-processing and so on. Let's meet in the next video with another topic. Till then, stay tuned to my channel for more such videos.
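
A small addendum to the second issue, interactive mining level by level. The sketch below is only an illustration and not something shown in the video: the student records, the field names (`branch`, `name`, `gender`), and the `drill_down` helper are all made up for the example. It applies the CSE, starts-with-S, and female filters one at a time, records how many results remain at each level, and stops before a filter that would leave fewer than the 10 students you want.

```python
# Hypothetical student records, generated so the counts mirror the example:
# 60 students, 30 in CSE, 20 of those with names starting with S,
# and only 3 of those female.
students = [
    {
        "roll_no": i,
        "branch": "CSE" if i % 2 == 0 else "ECE",
        "name": ("S" if i % 3 != 0 else "R") + f"_student_{i}",
        "gender": "F" if i % 14 == 0 else "M",
    }
    for i in range(1, 61)
]

# The search criteria, applied in this order, one level at a time.
filters = [
    ("branch is CSE", lambda s: s["branch"] == "CSE"),
    ("name starts with S", lambda s: s["name"].startswith("S")),
    ("gender is female", lambda s: s["gender"] == "F"),
]


def drill_down(records, filters, target):
    """Apply filters level by level and keep the count seen at each level.

    Stops before any filter that would leave fewer than `target` records,
    so the caller can decide what to do at that level.
    """
    current = list(records)
    history = [("all records", len(current))]
    for label, predicate in filters:
        narrowed = [r for r in current if predicate(r)]
        if len(narrowed) < target:
            # This level would leave too few results, so stay where we are.
            history.append((label + " (skipped)", len(narrowed)))
            break
        current = narrowed
        history.append((label, len(current)))
    return current, history


result, history = drill_down(students, filters, target=10)
for label, count in history:
    print(f"{label}: {count}")

# Pick the 10 members for the project from the last level that had enough.
shortlist = [s["roll_no"] for s in result[:10]]
print(shortlist)
```

With this made-up data the counts go 60, 30, 20, and the female filter would leave only 3, so the sketch stays at the 20 CSE students whose names start with S, and you can pick your 10 from there. Applying all three filters at once would have given you just the 3.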