The dictionary definition of artificial intelligence is a machine that can sense, reason, act, and adapt. But as we know, everything starts from data: collecting data, storing it, accessing it in a fast and efficient way, labeling it.

Now here's a question for you. If you had the chance to sit down in a room full of the top 10 business execs of all time, such as Steve Jobs or Bill Gates, to mention a few, and you were allowed to ask them any kind of question, but you only had one hour to spend with them, what question would you ask? I'm sure we would all agree that the best way to get the business insight out of them would be to ask, for example, how they would behave if they were facing your same business challenge.

You can see that room as your hard disk full of historical data. The problem is: what are the right questions to ask, especially if you've just started your AI journey? The answer is by knowing your goals. If you know exactly what you are trying to achieve, why you are trying to get to that point, and what the best approach you've planned so far is, you have a very good starting point.

As the analogy suggests, having access to that room is just not enough. It's not going to turn automatically into a magic AI tool that solves your business issue. Having access to the information is the starting point; being able to query it is essential to extract the valuable knowledge hidden in it. Owning the data is just the beginning of the journey. The process from owning to knowing what's in the data brings nontrivial challenges. Let's see what kind of challenges we're talking about.

Some of these challenges are not necessarily related to an AI project. They have been there since day one, since we started dealing with data. They are well known as the four Vs.

The first V is volume. It's actually the first challenge you bump into when you start talking about big data; it comes with the definition itself. Challenges like: how do you store the data, or redundant
information? How do you update that information? How do you query it? How do you back it up?

The second V is velocity, meaning the rate at which new data is being created or generated by your sources. Sources could be other programs that are feeding in new data, computing it, summarizing it, or new sensors and new devices acquiring new data. How do you inject that information into the pipeline?

The third V is variety: literally, how many different sources do we have to fit into our box, our AI box, of new data?

The fourth V is veracity. For example, how reliable is the information flowing into your system? What happens if a sensor is faulty, or stops working for some reason, and is still producing numbers, capturing values that are completely unreliable for building any prediction, detection, or whatever intelligent algorithm?

These challenges, the four Vs, are there every time we consider working with big data, and they are not necessarily related to artificial intelligence.

About that last bit, labeling: having a labeled data set is essential every time we're planning to deal with supervised learning. By the way, if you're not familiar with this concept, just bear with me; it's going to be the key focus of the second video. Now the question you might have is: but what is a label? Basically, it's just a flag that we put next to every item, or instance, or data point, call it as you wish, in your data set to define what category or class that object belongs to. It of course changes, or might change, according to the goal we are trying to achieve.

Because of this, I have many customers around the area with a historical data collection, and their question might be: okay, but how can I get a labeled data set? I do have the information; how can I get the last bit? Well, unless there is a way to automatically link the instances with the labels by combining different data sources in your company, there is probably also a good chance you've got to go through it manually. There are plenty of
activities that require manual labeling, and that's how you can see why the data portion is going to contribute more than 50% of the time to solution.

What we've seen so far applies to pretty much all data. If you start with a well-done procedure for labeling the data, you're going to reuse that information right up to the end of the project. All the challenges we've seen so far, like the four Vs, or the data-plus-AI challenges such as labeling a data set, are going to be there regardless of what data format you're working with. By data format I mean: images or video (you can see a video as a collection of images); time series, such as audio files or forecasts; text, like documents or posts on social media; and ultimately spreadsheets and tables, everything which is tabular in shape.

Knowing that there are four categories of data, or let's say big families, into which we can group all the data formats is extremely helpful. You know why? Because a common practice across all the businesses in the world is to split the market, from an internal point of view, into different industries. This split might differ from one company to another, but the bottom line is that all of us, when dealing with data for AI, are basically starting with one of those four categories: images, time series, text, or tabular information. This suggests that a solution that uses one of these data types might well be reused for a similar solution in a different industry.

For example, think about retail. If I were to develop an intelligent algorithm able to spot the different products on a shelf and recognize them by their label, I would likely be able to reuse the same technology to spot objects within a piece of luggage at the airport check. This is just one example of how AI can be extremely reusable. So if you're one of those companies that are used to splitting the market into different industries, you might be happy to know that a solution that has already been found in one of those sectors can be highly reused in
another one, just by switching the type of data and making some adjustments to the model.

With this, I really want to thank you, and I hope you enjoyed the video. Please come and join me for section two, on what machine learning is in artificial intelligence.
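
As a small illustration of the labeling point made earlier, where labels can sometimes be linked to instances automatically by combining different data sources, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the sensor readings, the maintenance log, the `machine_id` key, and the `link_labels` helper do not come from the video. It only shows the general idea of joining two sources on a shared key and flagging what is left over for manual labeling.

```python
# Hypothetical example data: none of these names come from the video.
# Source 1: unlabeled instances, one record per machine.
sensor_readings = [
    {"machine_id": "M1", "temperature": 71.5},
    {"machine_id": "M2", "temperature": 98.2},
    {"machine_id": "M3", "temperature": 69.9},
]

# Source 2: a separate record kept elsewhere in the company,
# mapping a machine ID to an outcome we want to use as the label.
maintenance_log = {"M2": "failure"}


def link_labels(instances, label_source, key, default_label=None):
    """Attach a label to each instance by looking up `key` in `label_source`.

    Instances with no match get `default_label`, so they can be
    picked out afterwards and labeled manually.
    """
    labeled = []
    for item in instances:
        row = dict(item)  # copy, so the original records are not modified
        row["label"] = label_source.get(item[key], default_label)
        labeled.append(row)
    return labeled


labeled = link_labels(sensor_readings, maintenance_log, key="machine_id")

# Whatever could not be linked automatically still needs a human.
needs_manual_label = [row for row in labeled if row["label"] is None]
```

In this toy case only one of the three instances gets a label automatically, and the other two end up in `needs_manual_label`, which mirrors the point that a good share of labeling work often remains manual.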