Transcript for:
Economics of AI Lecture Notes

Good morning and good evening, everyone. We'll get started; it's five minutes past the joining time. There are still a few folks requesting to be moved to panelists; if you could help with that, that would be great. "Yes, sure, I'm promoting them." Okay, let me share my screen and we'll get started.

Today we will be talking about the economics of AI, and as you will see, the phrase "economics of AI" has multiple interpretations. We will talk about at least three big ideas, actually four big ideas, which have implications for how you should think about expenditure and where the costs go when you are trying to adopt AI in your organization. In some sense that is the microeconomics view of AI: what you should be thinking about from a budget-planning perspective, where the costs go in building an AI model, and why this is such an expensive field to enter. The second aspect is the macroeconomics of AI: how AI is impacting things at a global scale; we'll talk a little bit about that. And then we will also talk about the impact on labor economics, or rather the role of humans, and what AI means for it. Those are the three or four broad themes we will look at today.

So let's jump in. The idea is to give you the money picture today. If you take a step back and think about what we have covered to date in our lectures, you will have gotten a fairly good feel for the fact that the performance of a machine learning or AI algorithm broadly depends on three big things, or three themes if you like. The first is computational power: what computational power is available to train the models. This is measured in a unit called FLOPS, floating-point operations per second; how many operations you can do per second is a measure of how much computational power you have. Especially if you are working with large datasets, you need access to computational power, and that is one place the money goes. The second thing that eats up money is data. The most obvious factor is the size of the data you have, and that is certainly a significant contributor to the cost you will end up bearing. But we will also talk about the diversity of data, and about a very interesting phenomenon, if you have not heard of it before, called the long tail of data, which creates significant challenges and adds significant cost when you are trying to build AI and ML models. That is the second aspect that impacts the cost of an AI/ML product or solution. The third, of course, is the machine learning algorithms or models themselves, which also eat up money. What exactly that means, and why I have kept computation and algorithms separate, you will see as we move along today. The long tail of data is a data distribution: just as you have the normal distribution, there is something called a power-law distribution. The mathematical name is the power-law distribution; the more popular name is the long tail, and I will cover it when I talk about data today.
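To make the FLOP bookkeeping concrete, here is a minimal sketch, not from the lecture, using the commonly cited back-of-the-envelope rule that training a dense transformer costs roughly 6 FLOPs per parameter per training token; the model and dataset sizes below are illustrative assumptions, not figures from the slides.

```python
# Back-of-the-envelope training-compute estimate (hedged sketch).
# Assumption: the widely cited approximation FLOPs ~= 6 * parameters * tokens.

def training_flops(params: float, tokens: float) -> float:
    """Rough total FLOPs to train a dense transformer once."""
    return 6 * params * tokens

# Illustrative (hypothetical) model and dataset sizes.
for params, tokens in [(125e6, 300e9), (7e9, 1e12), (70e9, 2e12)]:
    flops = training_flops(params, tokens)
    print(f"{params/1e9:>6.3f}B params, {tokens/1e12:.1f}T tokens "
          f"-> {flops:.2e} FLOPs ({flops/1e15:,.0f} petaFLOPs)")
```

The point of the sketch is only that compute scales with the product of model size and data size, which is why all three cost drivers move together.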
So let's start with computation: what it means and what you need to be aware of when you think about the computation required to train AI/ML models. This is tricky to explain and measure because most of it happens inside private walls, inside enterprise boundaries, but for a lot of models there is plenty of information available in the public domain, so that is what I will rely on: public data, to help you develop an appreciation of the economics of AI when it comes to computational cost. And of course, the computation cost is predominantly incurred when you are training a model. Recall that training a model is fundamentally the idea that you give an algorithm a lot of data and ask the algorithm to learn from that data, capturing what it learns in a software object called a model, a machine learning or AI model. A lot of the money gets spent in training a model, which is probably the most expensive part of building one. The training computation, that is, the expenditure on computation while training a model, is one of the three fundamental factors that drive the capability of the system. When I say capability of the system, I am talking about your AI or ML model: how capable it is, how powerful it is, how much it can learn, how well it can predict, and what size of data it can consume are all bucketed under system capabilities. So the key idea is that training computation is one of the key factors; the others, as we saw on the previous slide, are the kind of algorithm you use and the input data and parameters.

Now, here is a graph; let me zoom in and explain what is going on. On the horizontal axis is time, starting in 1950. On the vertical axis is the number of FLOPs, that is, the computation spent on training a model. I will draw your attention to the fact that the vertical axis is not a linear scale: a fixed distance represents one or more orders of magnitude of increase in computation. You go from 1 to 100,000, and from 100,000 to 10 billion; going down, you go from one petaFLOP to 0.01 petaFLOP. So the vertical axis is on a log scale. Each dot here is a published ML model, and the color of the bubble, per the legend on the right-hand side, represents the domain for which the model was built: starting from the bottom, computer vision, speech processing, recommendation systems, multimodal, language models, games, and drawing.
What you see is that from the beginning of AI in the 1950s and 1960s until about 2006, models were typically trained with less than one petaFLOP. That is one regime. Then you see a paradigm shift happening around 2007-2008 (I will run the simulation for you in a minute), and a crowding of models after 2008. 2008, remember, is also around the time the first deep learning paper was published in Nature, a world-standard scientific journal, perhaps the top one. Since then, over the last two decades or so, you see a lot of models being trained with very high computational power. So the key message is that the computation used to train publicly known AI and ML models has seen an exponential rise over the last two decades. I emphasize again that the vertical axis is logarithmic, so one step is one or more orders of magnitude; what looks like a linear relationship is actually exponential. There has been exponential growth in the computational power put behind training AI/ML models.

Let me also show you the source this is taken from. It is a fairly reliable source, something I would recommend you use regularly, as I do: the website Our World in Data. I will be using five or six charts from it, but this is the one we have been discussing: time on the horizontal axis, and the computational power put behind each model on the vertical axis. You can do interesting things with this interactive chart. For example, if you are interested only in language models, you can select just those, and you will see that language models really take off around the 2008 mark, consistent with when the deep learning papers start getting picked up. If you are interested in vision rather than language, you can view just vision. Looking at the whole picture, in the initial years you see a model once every five years or so; around 2008 you see an increase; and in more recent times you see an explosion of models, all in the language space, trained at around 10 billion petaFLOPs. So more and more computation is being thrown at the same problem, and more computation is making models more and more powerful. That is fundamentally what is happening, and that is what I wanted to show you: the amount of computation that has become available, partly due to Moore's law and computation becoming cheaper, has played a very important role in AI models becoming powerful. This is a theme you will hear a lot of experts repeat: yes, there has been growth and improvement in algorithms, but a large part of models becoming powerful is the contribution of more and more powerful computers and computation being used to train these models.
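To see why a straight line on a log-scaled axis means exponential growth, here is a small sketch; the doubling period is an invented illustration, not a rate read off the chart.

```python
import math

# Hypothetical assumption: compute doubling every 6 months (illustrative only).
# Equal steps in time multiply compute by a constant factor, so log10(compute)
# rises linearly: a straight line on a log-scale axis.
for year in range(0, 11, 2):
    c = 2 ** (year / 0.5)  # doubling period of 0.5 years
    print(f"year {year:2d}: compute x{c:.3g}, log10 = {math.log10(c):5.2f}")
```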
Now, this can be said in three or four different ways; this is one way of seeing it, where you are basically looking at the amount of computation thrown at a model. I'll show you two other ways to drive home the same point. Training computation is the first thing you will be spending on. Depending on the size of the model you are training you may not require this much, but this is where the state-of-the-art AI/ML models are. So if you are even considering training or building an LLM from scratch that is world class, the amount of computation you will need is something you should account for. If you have that kind of budget, all power to you, but this is something you should be keenly aware of if you are going to build very, very large models.

Okay, so that's one. The second thing I wanted to show you is that there is a reason so much computation is being spent on these large models. One of the things you will see, and I'll show you the evidence for it, is that the performance of these models does in fact get better as more and more computation is thrown at them. People are not doing this for the fun of it: you can actually measure the performance of a model and show that there is a very strong correlation between the amount of computation you throw at a model and its performance. So one of the things you need to figure out is how to measure the performance of a model. In generative AI, and in large language models in particular, you need the model to be able to do multiple things well; that is the nature of language models. One of the most well-established metrics in generative AI, especially for large language models, is the performance benchmark called MMLU, which stands for Massive Multitask Language Understanding. It is a database of multiple-choice questions spread across many subjects, 57 diverse subjects I believe, covering a large part of human knowledge. In this chart, the training computation, the amount of computation thrown at the model, is on the horizontal axis. Note that instead of time we now have computation on the horizontal axis; in the previous chart computation was on the vertical axis. On the vertical axis we have performance on the knowledge test: the percentage score across these diverse multiple-choice questions covering a wide range of human knowledge. Each dot here represents an AI/ML model, in this case specifically a large language model, and the size of the circle is an interesting metric: it measures the amount of data used to train that model. There are a lot of interesting things going on in this one visualization, so let me call out a few that I think are very important to realize. The first is the axes themselves. At some point, if you go back to your stakeholders and say "I need more money to build a more powerful system so that I can improve the performance of my model," you will get the ROI question, return on investment: how much money are we spending on training our model in terms of computation cost, and what return are we getting?
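Since MMLU is, at its core, multiple-choice accuracy averaged over many subjects, scoring it is mechanically simple. Here is a toy sketch; the data layout is hypothetical, not the benchmark's real file format.

```python
# Toy MMLU-style scorer: fraction of multiple-choice questions answered
# correctly, with a per-subject breakdown. The data layout is hypothetical.
from collections import defaultdict

def score(examples, predictions):
    """examples: list of (subject, correct_choice); predictions: list of choices."""
    hit, total = defaultdict(int), defaultdict(int)
    for (subject, answer), pred in zip(examples, predictions):
        total[subject] += 1
        hit[subject] += (pred == answer)
    per_subject = {s: hit[s] / total[s] for s in total}
    overall = sum(hit.values()) / sum(total.values())
    return overall, per_subject

examples = [("physics", "B"), ("physics", "D"), ("law", "A"), ("law", "C")]
predictions = ["B", "A", "A", "C"]
print(score(examples, predictions))  # (0.75, {'physics': 0.5, 'law': 1.0})
```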
This chart is a good way of capturing that: you put the performance of your model on the vertical axis, and on the horizontal axis the amount of computation spent to train that model, and that way you can assess whether it is worth throwing more computation at the model or not. The second thing I want you to notice is that as you move towards the right, the circles also get bigger. This is not a coincidence; it is one of the now well-established themes that the amount of data you use to train a model must be directly correlated with the amount of computation you use to train it. Let me say that again: if you only increase the computation without increasing the data, you will not get the improvement in performance; and vice versa, if you have too much data but no access to the computation, you will not be able to consume it in training the model. So the fact that the circles get bigger as you move right is not a coincidence: the amount of data you have and the amount of computation you throw at the model need to go almost hand in hand. And of course the color of the bubbles, as the legend on the right shows, depends on who built that particular model. You will see some famous names: OpenAI, Meta, Google, Hugging Face, DeepMind. These are the usual suspects who have built these very large scale models on very large datasets.

Okay, let me just run this and then I will take the questions being raised. Here is the chart again, zoomed in, and I'll leave it on screen while I take the questions. Rahul is asking: does this mean GPT-4 is a high-performance model, but will also incur a high cost of compute after training? No, Rahul, this chart is about training, so let's go one by one. This particular screen is saying that GPT-4 has been trained with a lot of computation on a very large dataset, and is one of the best-performing models on this benchmark called MMLU. The fact that it was trained with a lot of computation also means that a lot of money was spent training it. That is the first part of your question: that cost has already been incurred by OpenAI in training the model. I'm not entirely sure what you mean by cost of compute after training. As we mentioned on the previous slide, the primary cost of building a model is training it: when we say 10 billion, or 12 or 15 billion, FLOPs have been spent on this model, that is the amount of computation used to train it. Now, after the model has been trained, if you are using it to make predictions, which is what is happening today since GPT-4 has already been trained, then yes, invoking it through an API or through ChatGPT also has an expenditure attached to it, one you can potentially monetize by charging a fee. That is called inference cost. Right now, though, I am talking about training cost, which usually tends to dominate the amount of money spent on a model.
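To make the training-versus-inference distinction in that answer concrete, here is a toy cost model; every dollar figure in it is an invented placeholder, not an actual vendor number.

```python
# Toy comparison of one-time training cost vs cumulative inference cost.
# All dollar figures below are hypothetical placeholders.
TRAIN_COST = 50_000_000          # one-time cost to train the model ($)
COST_PER_1K_QUERIES = 2.0        # serving (inference) cost per 1,000 calls ($)

def total_cost(queries_served: int) -> float:
    return TRAIN_COST + COST_PER_1K_QUERIES * queries_served / 1_000

for q in (1e6, 1e9, 100e9):
    print(f"{q:>15,.0f} queries -> total ${total_cost(int(q)):,.0f}")
# Training dominates early; at very high serving volume, cumulative
# inference spend can eventually exceed the one-time training cost.
```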
Another participant is asking: in this chart it seems GPT-4 has better performance than Gemini Ultra despite Gemini being trained with four times the petaFLOPs; what is fundamentally different about GPT-4 that makes it outperform despite less training computation? Well, GPT-4 is a closed-source model, so we don't know too much about how it works under the hood, though we have a fairly good idea of what it is. As to what could be making GPT-4 perform better: one argument is that the kind of data GPT-4 was trained on is richer, more diverse, or more correct in some sense than Gemini's. Remember, the amount of computation is only one of the dimensions; the other two are the dataset and the algorithmic innovations. So at a 60,000-foot level, the answer would be that the dataset and the algorithm used to build GPT-4 are perhaps better than what was used to build Gemini Ultra. Exactly how, we are not privy to, because GPT-4 is closed source. Any other questions before we move on?

"Professor, this is Mano. If we have a model which is trained on a lot of data, like GPT-4, on the 10 billion plus we are talking about..." Careful, that is not data, that is computation. "Okay, computation. So a model that is heavy in terms of data plus computation: is it going to take more compute power when we use it after training, to host it?" What do you mean by the model being heavy? "I mean, say, GPT-4, which is trained on petabytes of data, all the world's data..." Wait, let's go slowly. This visualization is showing two things: the size of the bubble is the amount of data used to train the model, and the horizontal axis is the amount of computation used to train the model. There is a third dimension, which is the next slide: the size of the model. I haven't gotten to that slide yet, so if your question is about the size of the model, hold on until the next visualization. This one is about the computational power used to train the model, which is also a proxy for how much money was spent training it, and the data fed to the model, and how those two compare with its performance on the MMLU benchmark. "Okay, sure, thanks." Any other questions? Okay, if there are no other questions, let's continue.

So that was about model performance and how it does in fact correlate with the amount of computation used to train the model; there is a justification for throwing more and more computational power at a model. Now let's move on to the next chart, which addresses the question Mano was asking: the size of the model. What do we mean by the size of a model? You folks have not formally studied machine learning, so this term is slightly tricky, but let me use an analogy. In one of the earlier classes, which I believe Dr. MTI took, he talked about neural networks and deep neural networks, and how current neural networks and deep learning networks are modeled after the human brain. There is this idea that the human brain is made up of neurons, and these neurons are connected to each other; there are more than 100 trillion connections between the neurons in the human brain.
One good way of thinking about the size of a model is the number of connections between its artificial neurons. That metric is also called parameters: the number of parameters the model has. I am offering one particular way of defining model size because I am assuming you are working with deep learning models; if you are working with decision trees or linear regression, the definition of model size changes. But most large-scale models today are neural networks and deep learning networks, so I am not far from the truth: the size of a deep learning model can be measured by its number of parameters, which is more or less equivalent to the number of interconnections between its neurons. So when somebody says "I have an 8-billion-parameter model," what they are telling you is that there are lots of neurons, and the interconnections between those neurons number 8 billion. That is a measure of model size. And the fact that model performance increases with the amount of computation thrown at it is also reflected in model size; Mano, hopefully this answers your question.

So now we expand the statement I made earlier: not only are the size of the dataset and the amount of computation used to train the model strongly correlated, the size of the model is correlated with them too. The larger the dataset, the more parameters you need in your model, and the more computational power you need. All three of these move in sync. There are papers that actually try to establish an equation for how these three quantities relate to each other, but in any case they are strongly correlated: the larger the dataset you have, the bigger the model (the bigger the brain) you need to understand that data, and the more computational power you need to learn from it; and when all three grow in sync, the higher the performance of the model. That is the key message. Here is the big picture: on the horizontal axis we have now put the number of parameters, that is, how big the brain of the model is, measured by the number of interconnections between neurons, and on the vertical axis the computational power used to train these models. Again, the vertical axis is not linear but logarithmic, so what looks like a linear relationship here is actually exponential. But the point holds: the fact that the bigger bubbles sit at the top right tells you that the amount of data you need, the number of parameters your model needs to learn from that dataset, and the training computation you need to throw at the model are all strongly correlated. Let me switch slides and show you this diagram in more detail; these are interesting diagrams, worth spending time on. I'm not able to find the particular one I'm looking for, and I already showed you one of these, so you can explore them in more detail yourselves, but the point stands.
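To see what "number of parameters" means operationally, here is a minimal PyTorch sketch; the layer widths are arbitrary, and this is just one common way to count trainable weights.

```python
import torch.nn as nn

# A tiny illustrative network; the layer widths are arbitrary choices.
model = nn.Sequential(
    nn.Linear(1024, 4096),  # 1024*4096 weights + 4096 biases
    nn.ReLU(),
    nn.Linear(4096, 1024),  # 4096*1024 weights + 1024 biases
)

# "Model size" in the parameter-count sense: total trainable weights,
# i.e. the artificial analogue of connections between neurons.
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"{n_params:,} parameters")  # 8,393,728 for this toy network
```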
Let me take a question and then we'll move ahead. There is a question from Sesna: is there a correlation between model training cost and model fine-tuning cost? Good question. I have not seen data relating fine-tuning cost to training cost. Strictly speaking, my gut instinct is that fine-tuning cost will be related to the size of the model, and since the size of the model is related to the training cost, the fine-tuning cost will also be related to the training cost. That is how I would reason about it, though I haven't seen data to prove or disprove the statement. And this is the key takeaway from the three graphs I have covered so far: the size of the model, the amount of computation you throw at it, and the amount of data you feed it should all be correlated. You have to grow them all together; if you grow only one or two without the third, you will not get the impact you are looking for.

Deepak is saying: an interesting observation in this diagram is that the industry bubble is visibly much larger, even higher than academia; does it mean most AI development is happening in industry? Great observation, Deepak. In fact I have four or five slides on this topic in this very presentation, on the macroeconomic side of what is happening with AI; this section is the microeconomics side. Hold that thought; it is coming in less than three slides.

The last point before we move on from computational cost is that we expect this trend to continue. For those of you familiar with Moore's law: it basically says that the cost of computation keeps falling, which means a dollar today buys much more computational power than a dollar did 18 months ago. In fact it says a dollar today buys twice the computation it bought 18 months ago: every 18 months the cost of computation halves, because of advances in semiconductor manufacturing. This trend has continued much longer than anyone expected, and it is still continuing. So as the cost of computation keeps falling, you can expect to see much more improvement in AI models. Here is another interesting chart: on the horizontal axis is time, the release dates of GPUs, the chips that have been in the news quite a bit in the last month. On the vertical axis is computation per dollar: how much computation you can buy for one dollar. As you will see, it has been growing significantly; both axes here are linear. There is another interesting observation this diagram brings up, so let me zoom in. The first thing to note, as I said, is that this continues to grow: the cost of computation continues to fall, or equivalently, the computation you can buy per dollar continues to increase.
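The halving claim compounds quickly. Here is a quick sketch of the arithmetic, taking the 18-month halving period stated above as the assumption:

```python
# Compute purchasable per dollar if cost halves every 18 months (the
# Moore's-law-style assumption from the lecture; real curves vary by
# hardware generation).
def compute_per_dollar(years: float, base: float = 1.0) -> float:
    doublings = years / 1.5          # one doubling per 18 months
    return base * 2 ** doublings

for years in (1.5, 3, 6, 9, 15):
    print(f"after {years:>4} years: {compute_per_dollar(years):>8.1f}x compute per $")
# After 15 years, the same dollar buys roughly 1,024x the computation.
```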
Now, if you look at the manufacturers, you see something interesting. As all of us know, Nvidia is the dominant player in this space; they have been one of the key drivers of GPUs, the chipsets used, as I mentioned, to train today's AI and deep learning models. If you look at Intel, by contrast, there is very little work Intel has done in the GPU space, which is why they have been missing the boat on GPUs. The closest competitor is actually AMD, who have done quite a bit of work here; they are not as headline-grabbing as Nvidia, which is of course the market leader, but the close follow-up competitor to Nvidia is AMD. That is what is happening on the chip side, the hardware side.

There is a question from Robbie: what should be the deciding factors and guiding principles for on-premises versus cloud? I assume you are talking about the hardware infrastructure required to train deep learning models. There are several studies on this, but broadly speaking, you start with the cloud; once your model grows beyond a certain size, or if you have to train your model frequently enough, it starts to make sense to do it in house. Now, this is easier said than done, for two or three reasons. One is that you need the human talent to train these models on-prem; it is non-trivial to build an on-prem data center with GPUs where you can train them. The second is that, at least in the last 18 to 24 months, there has been a mad rush: one of the reasons Nvidia is doing so well right now is that demand for their GPU chipsets far exceeds supply, so there is a waiting time to buy GPUs. Even if you decide to do it in house, you cannot just buy them off the shelf; you will have to place an order and wait a few months before you can get your hands on the GPUs with which to build this in house. Any other questions before I move on?
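To make that cloud-versus-on-prem guidance concrete, here is a toy break-even calculation for renting versus owning GPUs; all prices and rates are invented placeholders, and as noted above, real decisions also hinge on talent and GPU lead times.

```python
# Toy cloud vs on-prem break-even for GPU training. All figures hypothetical.
CLOUD_PER_GPU_HOUR = 3.0        # $ per GPU-hour rented from a cloud provider
ONPREM_CAPEX_PER_GPU = 30_000   # $ upfront per GPU (hardware, installation)
ONPREM_OPEX_PER_HOUR = 0.6      # $ per GPU-hour owned (power, cooling, ops)

def breakeven_hours() -> float:
    """GPU-hours of use after which owning becomes cheaper than renting."""
    return ONPREM_CAPEX_PER_GPU / (CLOUD_PER_GPU_HOUR - ONPREM_OPEX_PER_HOUR)

hours = breakeven_hours()
print(f"break-even at {hours:,.0f} GPU-hours "
      f"(~{hours / 24 / 365:.1f} years of 24/7 use per GPU)")
```

The design intuition matches the spoken advice: low or infrequent usage favors renting, sustained heavy training favors owning.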
Okay, so the same message again. That was about the kinds of things you should keep in mind for budgetary planning. That discussion is most relevant if you are going to build very large models; for smaller models the same principles hold but the dollar numbers are smaller. The principle that computation cost, training data, and model size are correlated with each other is generic and holds independent of model size; the dollar amounts just become much more significant as the model and the training data grow.

Now, the second part of the presentation I want to spend time on today is something Deepak pointed out just now. This may or may not be directly relevant to you in the enterprise, but it is important to know, especially if you are going to be in a leadership role in the industry. The first thing, exactly as Deepak observed, is that there is a move out of academia into industry. If you look at the history of the topmost AI models and ask where these models are coming from (are they coming from universities, shown in green; from industry; or from a collaboration of the two, shown in dark blue?), you see a very interesting trend. The trend starts somewhere around 2013-2014: the number of models coming from industry has continued to grow at the cost of the number of models coming from academia. So it is a very valid observation that most state-of-the-art models now come from industry. It is a good question to ask why this is so. One potential explanation is that the capital required to buy this computational power is significant. Remember, the kind of computational power we are talking about costs real money, and industrial houses like OpenAI, Microsoft, Google, and Meta are putting a lot of the money they have into actually building these models. Few universities in the world have that kind of money to put behind building these large models. That is an important trend to be aware of regarding the source of these models. A lot of these models are open source, but the point remains that they originate in industry. This idea of industry versus academia is something we will see in three or four different ways, and it has major implications, not only for the tech industry but for geopolitics and the world economy.

What this means is that AI models are increasingly under the control of a few major companies. In some cases they are absolutely under that control because they are closed-source models; GPT-4 is a good example. It is a closed-source model: you do not really know how it works, you know roughly that it is a deep learning model, and it is controlled by OpenAI, and by Microsoft to some extent. But even when the models are open, as with the bunch of open-source models Meta has released, the kind of computational power you need to run them means you will end up going to one of the major cloud providers, like Microsoft, Google, or perhaps AWS. So the control of the big tech giants is going to increase as AI becomes more powerful, and this has implications for economics at the global scale. One argument is that it will continue to be used for profit. This is a slightly dated slide, but in 2024 we have already seen AI regulation take center stage, because once too much power aggregates in a few players, regulators step in and try to make sure there is no abuse of power. This is one of the reasons governments around the world are waking up. In fact, in the last month or so we have seen discussions about something called sovereign AI, a play Nvidia has put into the discussion: Nvidia is arguing that each country should own the AI models it uses, that models need to be owned, deployed, and controlled at the country level, so that the monopoly, or control, of the large tech companies can be kept in check.
The other trend you see, and I think this speaks to Cortland's point, is that even the human capital is now moving towards industry. This graph tells the story: the horizontal axis is time, from 1950 to 2022, and what is shown is where the teams that built the topmost models resided, green for academia and dark blue for industry. Around 2013-2015, dark blue becomes the dominant color, well above the greens, again emphasizing that most of the human capital and expertise now sits in industry. Research publications from industry are increasing, the human capital is moving, and again this reinforces the message that AI is going to be controlled by the tech companies, and regulation has to be thought about carefully. Another interesting trend is the number of people being hired by industry versus academia. Here the horizontal axis is time and the vertical axis is the number of AI PhDs staying in academia versus going into industry, and again around 2015 you see industry starting to absorb many more PhDs, often at academia's expense. Academia has stayed more or less flat, but more and more graduating PhDs are moving to industry, which goes to Cortland's point that industry pays much bigger salaries. All the same messages are reinforced: it is the same set of half a dozen or so companies building these models. If you look at investments, you see the same trend: whether in acquiring companies, in private investment from VCs, or in smaller stakes, there is a growing trend of more and more money, in billions of dollars, being put into industry, and it will continue to grow.

Now, the last aspect of this that I briefly want to mention, which I thought was very interesting: we talked about Nvidia versus AMD versus Intel as suppliers of GPUs, which is the corporate way of looking at who is selling them, but there is also a geographical view, and this is very rarely discussed in the media. At your level of seniority you should be aware that there is a geopolitical angle here: the production of these GPUs, the actual manufacturing, is limited to very few countries. If you look at design, most of it happens in the US, as you would expect; design is dominated by the US, more than 60%. But when it comes to fabrication, Taiwan becomes the dominant country where GPU chips are fabricated, assembled, tested, and packaged. The fact that Taiwan has become so important geopolitically has many reasons, but I believe this is definitely one of the contributing factors: for GPU chips and powerful CPU chips, Taiwan captures more than half the world market. So between Taiwan and the US you have virtual control; as you would expect, China and South Korea follow close behind, and then there is a long tail of a few participating countries. That tells you a lot about the geopolitical impact of AI
and the supply-chain impact of AI. If you are able to cordon a country off and cause supply-chain disruptions, you can actually impact the progress of a competitor country in the AI space. There are many implications of that, some of them not appropriate to discuss in a public forum, but you can make your own judgments about what is going on; it is something you should be aware of. Parag is commenting: probably that is the reason behind offering a DBA versus a PhD in this course, that industry is generating more research papers in this space due to vast dataset availability. All right, let's move on.

Okay, so that was the first part I wanted to cover: computation, and the economics of AI from the perspective of computational power, data size, and model parameters. The second thing I want to focus on is the economics of data: how data impacts the economics of AI. This is also a very interesting topic, so let's start small, at the beginning. You must have heard the term economies of scale. What we see with data, in some sense, is what we call diseconomies of scale. What does that mean? As you would imagine, any time you have to build or create a dataset there is a cost to it: a cost to collecting data, a cost to processing it and bringing it into a certain form, and a cost to maintaining it. That much is obvious. What is not so obvious is that the economics of this is fairly counterintuitive. As you start to collect more and more data, the marginal cost of collecting and maintaining data actually falls: it gets cheaper and cheaper to collect, process, and maintain data per unit of data. So far this is economies of scale, which makes sense: the more data you collect, the lower the marginal cost of an additional unit of data. Why, then, do we call it diseconomies of scale? Because the marginal benefit of adding a new data point, the value it adds to your dataset, also falls. Both statements together are counterintuitive, so I will repeat them: as you continue to collect more and more data, the cost of adding and storing one more data point falls, but the value of adding one more data point also declines, and it declines much faster than the cost does, so the effective value of the data point is not there. I hope that makes sense; I'll wait 30 seconds so you can absorb what the statement means. If you have questions, ask me; otherwise I'll move on.

"How do we say the cost of storing goes down, the marginal cost?" Okay, here is the way to think about it. Say it costs $1 to store the first megabyte of data. The cost of storing the second megabyte will be 90 cents; the total is still $1.90, but the marginal cost of storing a unit of data falls. Does that make sense? "Yeah." And Deepak, how the value declines is coming in the next bullet point. "Is there a qualitative or quantitative way to determine this cliff, the trade-off in training?" Yes, and here is the rule of thumb, and why this is so interesting; in fact I am going to
spend the whole next 45 minutes talking about it. The value of data, if you quantify it, is how that data impacts the accuracy of a machine learning or AI model; that is the sense in which I am measuring value. The reason the value falls is that it is relatively easy to build a model that gives you, say, 75% accuracy. It is much harder to improve the model to increase its accuracy from 75% to 85%, much harder still to go from 85% to 90%, and an order of magnitude harder again to go from 90% to 93%. As your model gets better and better, each further step takes much more effort. Said in data terms: initially, improving the performance of your model by throwing more data at it might be easy, but beyond a certain point, to get the same improvement again you may actually need 10 times more data than you used in the previous step. So when I say the value of data, I mean only its value in improving the accuracy of your ML or AI model. Getting the first 60-70% of performance is easier, but as you continue to make your model smarter and more predictive, the effort, the money, and the data you need grow much faster, which means the marginal value of data is falling: one megabyte of data is less valuable now, because you need many more megabytes to get the same improvement in model performance. Does that make sense?

Deepak, to your question: I can't simply agree or disagree. There is no universal way to say what a good accuracy benchmark is; it depends very strongly on the domain. If you have a model that can predict with 60% accuracy whether the stock market will open higher or lower tomorrow, 60% is great: you can make billions of dollars if you can beat 50%, because the only question being asked is whether tomorrow it opens higher or lower. But 60% is horrible if you are trying to do cancer detection; there you need 99% plus. So the accuracy that counts as acceptable, great, or bad is very much domain-dependent. Robbie is asking how the value is defined in this case; I think I answered that. Robbie, did you get the answer? "My question, Professor, is this: if we are saying the marginal value of the data declines because the next model to be trained will require tons of data, then it also depends on the availability of time and other factors to generate that data, say for cancer detection, versus the original intent of the data we started with. So is value really subjective to the three things you talked about earlier (time, computation, and availability)? In that case are these diseconomies of scale implied, or are we saying the value is consistently declining because the data required will be much more?" First of all, the previous section was about computation cost, model size, and model performance, not about the cost of data; I am not making any statement there about how costly it is to produce data.
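One common way to formalize "each further step takes disproportionately more data" is a power-law learning curve, error ~ a * D^(-b); the constants below are invented purely for illustration, but the shape shows the diminishing marginal value of data described above.

```python
# Power-law learning-curve sketch: error ~ A * D**(-B). The constants are
# illustrative, not fitted to any real model.
A, B = 2.0, 0.25

def data_needed(target_error: float) -> float:
    """Invert error = A * D**(-B) to get the dataset size D required."""
    return (A / target_error) ** (1 / B)

prev = None
for acc in (0.75, 0.85, 0.90, 0.93):
    d = data_needed(1 - acc)
    growth = "" if prev is None else f"  ({d / prev:.1f}x more than last step)"
    print(f"accuracy {acc:.0%}: need ~{d:,.0f} units of data{growth}")
    prev = d
```

Under these made-up constants, each few-point accuracy gain multiplies the data requirement several times over, which is exactly the 75-85-90-93 story told above.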
Note also that the statement "the cost tends to decrease over time" is separate from the statement about marginal benefit. In fact, instead of saying "the value of data," which seems to bother some of you, I should say: the marginal benefit of an additional data point for your model falls. Does that make sense? "Yes, it makes sense."

Nick has a long comment: it is unknown how long it will take to train the model, and we don't know how long the model will remain relevant before it needs to be retrained; the diseconomies of scale make the balance sheet a bit more complicated, and from a cost perspective it becomes difficult to do an ROI calculation. How is this issue managed by VC groups looking for a reasonable return on their investment? Great question, Nick. a16z, one of the leading venture capital firms in the world, has a series of articles on this, referenced at the bottom of the slide, and if you are interested in the topic I encourage you to read them; they have written extensively about it, and some of my material is drawn from their blogs. It is a very difficult problem for VCs, and what I am covering is probably only 30-40% of what a VC needs to worry about before putting money into an AI company. As you surely know if you are asking this question, VCs look at a bunch of things in the startup world: not only where you are, but whether you are aware of what you are getting into. My objective today is at least to expose you to where the costs will go: as I said, into computation, and into data, about which I will say much more over the next 30 minutes. Data itself is a huge problem, and there are no good mathematical formulas for what you ask, Nick; at this stage it is pointers and heuristics.

Sati is asking: is there any relationship between model parameters and data volume? When data increases, do we need to increase the model parameters? Sati, yes. We covered this in the section on computational power: the parameters of the model and the size of the dataset are strongly correlated, and not only those two; the amount of computation needed to train a model with a large dataset and a large number of parameters is also higher. All three are highly correlated. Sashan is asking: is there a way to calculate data asset value? The answer, I think fairly obviously, is no, because there is no way to even calculate the value of your model. First you would have to quantify the value of your model; if you could do that, then arguably you could calculate the data asset value by relating the value of the model to the amount of data used to train it. Again, I encourage you to look back at the charts in the first section and you will find hints. In fact, the answer to Sashan's question is in one of the previous slides, with the bigger bubbles.
Let me show it to you, since these are very important charts; they seem fairly obvious, but if you think about them deeply enough you get the answers. Sashan, this is the slide that answers your question. I will draw your attention to the fact that the vertical axis is the performance of the model and the size of the bubble is the training data size. The fact that the largest bubbles are at the top answers your question: larger datasets are more valuable than smaller datasets. And if you were to draw a different graph, with the size of the data on the horizontal axis and the performance of the model on the vertical axis, then the slope of the line would give you the data's value. Does that answer your question? "Yes, Professor, thank you so much." All right, let's move on.

Okay, so we have talked about industry versus academia, the cost of capital, and the geopolitical implications. Now the last thing I want to bring in here, which is perhaps the most complex part of this data discussion, is that these diseconomies of scale are made worse by the fact that a lot of real-world data has a long-tail distribution. I'll try to explain this over the next 20 minutes; it is a fairly unintuitive topic if you are seeing it for the first time, but important enough that we should talk about it. Here is what a long tail looks like. Let's spend a minute understanding this chart. It is about search, but as you will see on the next slide, the same shape appears in a lot of real-world datasets. On the horizontal axis you have the keywords people type into a search engine, say Google, sorted so that the most frequent keywords are on the left and the rarer ones are towards the right. On the vertical axis you are plotting how many times each keyword was searched in a month. If you look at the leftmost dark blue bar, the top 100 keywords are each searched millions of times. As soon as you move to the next bar, the top 500 keywords, even though the bar now covers more keywords, the next 500 words are searched only about 100,000 times. Right there you see a significant drop in popularity: top 100 keywords, millions of searches; top 500, about 100,000; top 1,000, only about 10,000; and then an exponential fall, until, as you go further right, the vast majority of keywords are searched a few dozen times at most. Now, this is a very challenging problem, because as you would imagine, the bulk of a search engine's revenue, which comes from ads, comes from the blue part of the graph: those are the common keywords advertisers want to buy. Yet even though the bulk of the revenue comes from the blue part, the blue part constitutes less than 30% of all keywords; 70% of keywords lie in the orange part, which the search engine serves at a loss.
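To get a feel for this, you can simulate a Zipf-style (power-law) popularity distribution and ask how much traffic the head captures; the vocabulary size and exponent here are arbitrary choices, not fitted to real search data.

```python
# Zipf/power-law sketch: the k-th most popular item has popularity ~ 1/k**s.
# Vocabulary size N and exponent s are arbitrary illustration choices.
N, s = 100_000, 1.0
weights = [1 / k**s for k in range(1, N + 1)]
total = sum(weights)

def head_share(top_k: int) -> float:
    """Fraction of all 'searches' captured by the top_k most popular items."""
    return sum(weights[:top_k]) / total

for k in (100, 500, 1_000, 10_000):
    print(f"top {k:>6} of {N:,} items capture {head_share(k):.1%} of traffic")
```

With these parameters the top 100 items capture roughly 40% of the traffic, yet the remaining 99,900 items still account for the majority: the tail cannot be ignored.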
But they must serve that 70% tail, because otherwise even the blue goes away: even though serving the long tail is extremely expensive, you must serve it to monetize the blue part of the graph. As I said, this chart is about search, but the pattern is very common across businesses. Here is another example: 43% of Amazon's sales come from the green part of the graph and 57% from the long tail. Almost an equal split, 50-50 if you like, but the long tail is very expensive because it is very long: the number of titles Amazon must hold is very large, while 43% of book sales come from a very small subset of the books. A lot of real-world datasets follow this long-tail distribution; it is basically an indicator of how diverse human choice is. The problem is that this long tail becomes a big challenge for machine learning and AI models. Broadly speaking, if your dataset has a long-tail distribution, it is much more expensive to build an ML or AI model for it than if your dataset has a normal distribution, which is the nice, well-behaved case. The long tail is very challenging and very expensive.

Here is another way of looking at it: think about chatbots. If you are building a chatbot using generative AI and it is supposed to answer only 100 questions, you can build it very cheaply. But if you want it to answer any question, which is what ChatGPT is, where there is a long tail of what users talk to the chatbot about, so the prompts themselves follow a long tail, then it is very expensive to build. This is what makes it hard to build good machine learning models and makes the diseconomies of data even worse: you have to solve for the whole distribution, otherwise you cannot even monetize the head. If you are given the problem "build a machine learning model that works on the blue part of the graph only," that is a much easier problem; if somebody says your model must serve the long tail as well, that is much harder. In fact, if you look at the mathematical equation describing this curve, the horizontal axis goes on to infinity; the curve never reaches zero. The diversity of data you have to serve is not just practically very large; it is theoretically infinite. Which means that every time a new keyword, a new search, a new data point arrives and you expect your model to perform well, think about what that means for something we discussed in the last session: continuous learning. Every time a new kind of data point comes in and you are still supposed to perform well, you must retrain the model. These edge cases will keep emerging, and they cannot be ignored, because (a) they are missed customer opportunities, (b) they can create a bad user experience, and (c) they can actually lead to people losing out. And this really is at the heart of how software engineering differs from AI and ML, and why AI/ML projects go over budget and often over time.
So the problem, as I said, is this: if you are given a task that says "build a machine learning model which works on the blue part of the graph only," that is a much easier problem. But if somebody says your model must serve the long tail as well, that is a much harder problem. In fact, if you look at the mathematical equation describing this kind of curve, the horizontal axis goes on to infinity; it never drops to zero. That means the diversity of data you have to serve is, if not practically infinite, theoretically infinite. Every time a new keyword, a new search, a new data point comes in, you still expect your model to perform well. Think about what that means for something we discussed in the last session, continuous learning: every time a new data point arrives and you are still supposed to perform well, you must retrain the model. These edge cases will keep emerging, and they cannot be ignored, because (a) they are missed customer opportunities, (b) they create a bad user experience, and (c) they can actually lead to losing people. This really is at the heart of how software engineering differs from AI and ML, and why AI/ML projects go over budget and often over time. If you come from a software engineering background: software engineering is a very well-defined process where you have a requirement, you build a product, and you ship the product or project. AI development is a continuous process. The developer can pick an algorithm, pick the data, throw the data at the algorithm, and build a model, but whether the model will perform well is a very difficult question, because you are fundamentally asking whether the model can describe the real world and adapt with it. And as you know, the real world is much messier, which creates a huge problem for AI models, because you are asking them to work in all kinds of scenarios. So if you are going to undertake an AI/ML project or product, keep in mind that the costs can get out of control very quickly and the timeline can slip very quickly, and it is really nobody's fault: real-world data is complex, it is long-tailed or heavy-tailed, and to build good systems you must solve not only for 80% of the customers but for 95 to 99% of them. That makes this a very challenging problem.

This has led to a movement towards what people now call data-centric AI. What that means is this: until roughly five to ten years ago, through the 2000s and early 2010s, a lot of the focus was on which algorithm or which machine learning model you would use. There was a great deal of research, both in academia and in industry, on creating better algorithms and better models, and the data sets were pretty much fixed. What has happened since 2012-2013 is that the focus has changed. Some dominant models have emerged, many of them in deep learning, and others are now fairly robust and well established too, for example XGBoost or random forests. There are a dozen or so algorithms known to do a good job across the problems explored in academic research and in industry, so the model-research problem is smaller. There is still a lot of work happening there, but on the enterprise and industry side the focus has shifted to being data-centric. Whereas the teams and the human talent, until the early 2010s and even before, were focused on the model, the focus now is on engineering the data to build great systems, because for most enterprise purposes the model is more or less a solved problem; its research continues in a few circles, but in industry it is mostly about building great data pipelines and great data engineering teams that can feed these models continuously. So the focus is on data centricity: improving both the quality and the quantity of data, solving problems like the long tail I talked about on the previous slides, making sure there are no duplicates, making sure the data is not biased. We won't get into bias today, but data bias is real; just yesterday and the day before there was a controversy with Google Gemini's text-to-image generator, which was impacted by bias in its data set. These are the real problems industry is working on now, not the model itself.
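(Aside: a minimal sketch of one of those data-centric chores, exact-duplicate removal, on a toy corpus. The rows and the normalization are illustrative; real pipelines add fuzzier near-duplicate checks such as MinHash or embedding similarity.)

```python
# Detect and drop duplicate rows in a training corpus before they
# ever reach the model.
import hashlib

corpus = [
    "reset my password",
    "Reset my password",   # near-duplicate: differs only in casing
    "reset my password",   # exact duplicate
    "track my order",
]

def fingerprint(text: str) -> str:
    # Normalize lightly, then hash; fuzzier checks would go here.
    return hashlib.sha256(text.strip().lower().encode()).hexdigest()

seen = set()
unique = []
for row in corpus:
    h = fingerprint(row)
    if h not in seen:
        seen.add(h)
        unique.append(row)

print(f"kept {len(unique)} of {len(corpus)} rows")  # kept 2 of 4
```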
Data-centric AI, if you need a definition, basically breaks down into three sub-themes. One is developing the training data: how do you collect and produce high-quality, rich data to support the training of machine learning models? Two is developing ways to evaluate your model; this again is a tricky subject, because there are standard metrics like precision and recall, but depending on the domain you may want to develop specific test data sets on which to measure your model's performance. And three is data maintenance: as data continues to grow, how do you continue to ensure its quality and reliability? I'll skip this next part and move on.

It's 7:49; instead of starting the next section, let's take a 10-minute break. Let me pause here and see if there are any questions I can answer first. "Sorry, I can barely hear you; can you speak closer to the microphone? Can you hear me now?" Yes, this is better. "So is the data-centric model more domain-specific? If we restrict the data to a specific domain, say medical, or oil and gas, does that mean we are making it more data-centric? Is that the right understanding?" No. What you are describing is domain-focused, and domain-focused and data-centric are two different ideas; let me spend a minute on that. Data-centric says: whatever problem you are working on, and it might be relevant only to your company, or to your industry, or it might be a global problem, independent of that, the large majority of your focus should be on building a great data set, so that you can use one of the well-known algorithms to build a model. Focus your time and energy on collecting the right data from which the model can learn. That statement holds whether your problem is company-specific, domain-specific, or global. Does that make sense? "Yes." Any other questions? And yes, Sash is absolutely right: there was a slide three or four weeks ago where I made the point that in today's AI and ML work, 80% of the time is actually spent on the data, and training the model itself takes 10 to 15% of the time. Okay, if there are no questions we'll pause. It's 7:52; we'll take a 10-minute break and be back at 8:02, when we'll continue our discussion on the economics of AI. We have talked about computation and data; today we'll also talk about algorithms and models, and then about the human part of this, the human-capital part. See you in 10 minutes.

All right, good morning, good evening everyone, welcome back. We'll continue our discussion on the economics of AI. In the first part of today's session we talked about computation, and we saw how, once you start talking about the amount of compute you need to train a model, that discussion very quickly leads to the size of the model, the number of parameters it has, which in turn leads to the size of the data set you have to train it on. We saw how these three things are correlated, and that if you increase all three you can get higher model accuracy.
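(Aside: a commonly cited back-of-the-envelope approximation from the scaling-laws literature ties these three quantities together for dense transformer models. It is an illustrative rule of thumb, not a figure from the lecture:)

```latex
C \approx 6\,N\,D
% C = training compute in FLOPs, N = parameter count, D = training tokens
```

So doubling either the parameter count or the number of training tokens roughly doubles the training compute, which is one way to see why the three quantities rise together.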
We looked at that both from a historical perspective, the evolution over time, and in terms of how the three correlate with each other. (Did you have a question? You're unmuted. Okay.) The second thing we looked at was how data impacts the economics. Data size is an important metric, how big the data is, but it is perhaps a misleading one, and the reason is the way a machine learning model's performance relates to data size: the marginal impact on a model's accuracy of every additional data point is not linear. There is a diseconomy of scale, and the problem is made worse by real-world data distributions being long-tailed.

The third thing we're going to talk about today is the model part, the algorithmic part: what are the considerations and implications for the economics there? I had a slide on this earlier, but remember that once you have the data and the compute accounted for, you then rely on an AI developer, a machine learning engineer, a data scientist, whatever you call them, to explain or predict the real world using a model. As we mentioned during the continuous-learning session, this is a continuous effort, and it is important to recognize that it is a perpetual problem, not an easy one: the world keeps changing, the world is fairly complex, and modeling it with any kind of model is non-trivial. So the call-out is to recognize that this is a hard problem and likely to stay hard.

Having said that, the first principle when building a machine learning or AI model is to keep it simple. That's what KISS stands for; it's an acronym for "keep it simple, stupid," which says: keep the model as simple as possible. If the problem you are trying to solve can be solved with a simpler model, don't use a very complex one. There are several reasons for this. Simpler models are interpretable, meaning you can understand why they make the predictions they do; they are reasonably easy to scale; and they are cost-effective. So even though we said the world is very complex, and large models can be very computationally heavy, the starting point of your journey should be simple models. There is a tendency among people starting out in this field to jump immediately to "let's build a neural network or a deep learning model for this," which is tempting because it's cool and a shiny new toy, but most experienced people will say: start simple, and if you can make it work with a linear model, stick with the linear model. Sophisticated, large models are expensive to train and expensive to maintain, and there are situations where they actually perform worse than simpler techniques. For those of you who have studied a bit of machine learning and data science, this happens with something called overfitting. If you have not heard of it, just keep in mind that having a larger, more complex model does not always mean it will outperform a simpler one; that statement is simply not true. Another way of saying this is that more complex, sophisticated models tend to over-parameterize on small data sets. They can also produce what are called fragile models, a very interesting problem in AI/ML: models that may do very well on the test data you have, but whose performance falls off very quickly once deployed in production. This problem is much more common with complex, large models like deep learning than with simpler or linear models. So: many, many reasons to keep it simple.

Having said that, this guideline is exactly that, a guideline, not a rule, and that's what the picture on the slide is trying to show. There are two ways of looking at it. The simple way is the seesaw diagram at the top, which says that simplicity and complexity are a trade-off and you need to hit the sweet spot for your data set: you can't keep your model too simple (the seesaw on the left) and you can't make it too complex (the seesaw on the right); you have to find the right level of complexity for your data, which is somewhat of an art. And if you have worked in machine learning and data science, the graphs below will make sense to you: data points shown as a scatter plot, where a linear model (left panel) is too simplistic, a fifth- or sixth-degree polynomial (right panel) is too complicated, an overfit, and a quadratic model (middle panel) is just about right. Find the right complexity, and keep it as simple as possible: that is the first guideline for algorithms and models.
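(Aside: a runnable sketch of the seesaw just described. It fits polynomials of degree 1, 2, and 6 to noisy quadratic data and compares held-out error; the data is synthetic, so the point is the pattern, not the exact values.)

```python
# Underfitting vs overfitting: compare polynomial degrees on noisy
# quadratic data using a simple train/test split.
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(-3, 3, 30)
y = 0.5 * x**2 + rng.normal(scale=1.0, size=x.size)  # quadratic + noise

x_train, y_train = x[::2], y[::2]   # even points for training
x_test, y_test = x[1::2], y[1::2]   # odd points held out

for degree in (1, 2, 6):
    coeffs = np.polyfit(x_train, y_train, degree)
    pred = np.polyval(coeffs, x_test)
    mse = np.mean((pred - y_test) ** 2)
    print(f"degree {degree}: held-out MSE = {mse:.2f}")
# Typically degree 1 underfits, degree 6 overfits, and degree 2,
# matching the true data-generating process, is about right.
```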
Okay, Seline has a question. She's asking: if RAG techniques are used and the GenAI is only querying a fixed set of documents, will there still be long-tail challenges? And are there situations where long-tail issues are not an issue for GenAI? Let me take the first question. Even though the set of documents is fixed, Seline, the users' queries are not fixed. You cannot control what the user asks, and the query set users throw at your model will also have a long-tail distribution, even if the number of documents behind your RAG is small. So you can absolutely have a long tail of user queries even in a RAG-based solution. These are two different sides of the equation, if you like: the RAG technique is on the supply side, the data scientist's or machine learning engineer's side, of the discussion. As far as the user is concerned, he or she doesn't know whether you are using RAG or deep learning or decision trees or linear models; they will continue to behave as they behave, and as long as there is an open user interface, the queries they throw at you will very likely follow a long tail. As for your second question, whether there are situations where long-tail issues are not an issue for GenAI: negative proofs in general are very hard to give. The best I can say is that I have not seen one. Does that mean it's not possible? I don't know. But most GenAI solutions will suffer from long-tail issues, primarily because their interface tends to be a text-based one: the minute you expose a text-based interface to the open public, users' queries will tend to follow long-tail distributions.
Okay, the next question: technically, can we say that whenever there is semi-structured or unstructured data, there will be long-tail issues? I'm not sure the long tail can be attributed only to semi-structured or unstructured data, and there's no reason it should not show up in structured data. A counterexample would be the Amazon case: users' product-purchase patterns are structured data (even a portfolio of 4 million products is structured data), and even there you see long-tail behavior in what customers buy. So you will have long-tail issues in structured data as well. "Thanks, Professor."

Jack is asking: can you discuss fragile models further? Sure. Let me first define what a fragile model is: it is a model that you have trained on a certain data set, on which it performs very well, but when you deploy it in production its performance falls significantly. We say such models are fragile, which is another way of saying that their performance depends very strongly on the distribution of the data on which they were trained, and even slight changes to that distribution will expose the fragility. Yo is saying that fragile models imply overtraining; I think you mean overfitting, and yes, fragile models are very strongly related to overfitting. I won't go into much detail, except to say that overfitting is itself a huge topic, and that certain families of machine learning algorithms are more susceptible to it than others.

All right, let's continue; we're talking about the economics of machine learning and AI models. We started by saying keep it simple: do not jump into very complex models on day one; it is neither desirable nor cost-effective. That's the first principle. The second principle is something we have already talked about: your life does not end when you deploy the model. You must continue to train it. This point about continuous learning, which we spent a whole session on, becomes even more important when the data coming in is heterogeneous. Even though you trained your model on data of a certain size, as more data arrives, and its heterogeneity is large, your model can degrade in performance very quickly, because the new data is significantly different from the data the model was trained on. This will happen with long-tail data distributions. So the idea that you must continuously retrain your model becomes very important, and that is what the picture here shows: on the long tail, you can shift your model's performance by adding more training data, or by adding more diverse data from the tail. Both are examples of improving performance by growing the data set on which your model is trained.
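(Aside: a minimal, runnable sketch of that retrain-when-degraded loop. The "model" is a toy majority-class predictor and the drift is simulated, purely to show the monitoring pattern; swap in your real training and evaluation pipeline.)

```python
# Monitor rolling accuracy on live traffic; retrain once it degrades.
from collections import Counter, deque

def train_model(labeled):
    # Toy stand-in for a real trainer: predict the majority label.
    majority = Counter(y for _, y in labeled).most_common(1)[0][0]
    return lambda x: majority

WINDOW, THRESHOLD = 200, 0.60
history = [(i, "a") for i in range(100)]     # initial training set
model = train_model(history)
recent = deque(maxlen=WINDOW)                # rolling correctness flags

# Simulated live traffic: behavior drifts from "a" to "b" partway in.
stream = [(i, "a") for i in range(300)] + [(i, "b") for i in range(500)]
retrains = 0
for x, y in stream:
    recent.append(model(x) == y)
    history.append((x, y))                   # keep new (tail) examples
    if len(recent) == WINDOW and sum(recent) / WINDOW < THRESHOLD:
        model = train_model(history[-400:])  # retrain on recent data
        recent.clear()
        retrains += 1
print(f"retrained {retrains} time(s)")
```

Note that the first retrain fires before enough drifted examples have accumulated to change the model; only a later retrain adapts it. That lag is typical of this pattern and is one reason continuous learning is an ongoing cost, not a one-off fix.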
You can do that either by moving the blue/yellow divide we saw in the long-tail chart to the right, or by picking specific points from the long tail that are more impactful from a business perspective; either way, you improve the model by improving the data it is trained on. So the first point was keep it simple, and the second point was: if you're going to be data-centric, pick data points from the long tail to improve your model's performance. A question from Himansu: "I've heard Tesco uses a real-time model called Rolling Ball for product segmentation; is that an example of such a continuous-learning model?" I'm sorry, I'm not aware of the Rolling Ball model; I haven't heard of it, so I won't be able to comment. Okay, let's move on. There are other levers for continuous learning too: you can fine-tune the architecture of your model or adjust the hyperparameters; all of that will be covered later.

The third consideration, when it comes to the economics of the algorithmic or modeling part of AI, is actually a user-experience, customer-experience, design-centric focus. I wanted to include this to make you realize that even though your focus might be on making your model perform better, a data-centric approach to AI expands your scope beyond model improvement to the overall pipeline: where is the data coming from, where specifically is the long tail of data coming from, and can you actually cut the long tail short by optimizing the user experience itself? One interesting example is the autocomplete feature in a search box. This is a case study I read somewhere: before LinkedIn implemented autocomplete, there were 177,000 different entries referring to IBM: "IBM" in various capitalizations, "International Business Machines," all kinds of terms for the same entity. By making an interesting tweak to the user experience, adding autocomplete, they reduced the complexity of what the model had to recognize, because autocomplete completed something like "IB" to "IBM" or "International Business Machines," and all of these were internally mapped to the same entity. So you can re-examine where the long tail in your data is coming from and make improvements in the user interface or the user experience to capture user intent and shrink the long tail into a much shorter one, which can have a huge impact on what your model can do. You are essentially getting rid of user error, recognizing that a large part of the long tail may actually be user error. Of course that is not always true, but be aware that part of the tail may be noise coming from challenges in the user interface. So that's something that helps.
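(Aside: a minimal sketch of that shrink-the-tail idea, collapsing free-text variants onto canonical entities before they reach the model. The alias table and normalization are hypothetical; in the LinkedIn story the mapping is driven by the autocomplete UI itself.)

```python
# Collapse long-tail spelling variants onto one canonical entity.
CANONICAL = {
    "ibm": "IBM",
    "i.b.m.": "IBM",
    "international business machines": "IBM",
}

def canonicalize(raw: str) -> str:
    key = " ".join(raw.split()).lower()      # normalize whitespace/case
    return CANONICAL.get(key, raw.strip())   # unknown names pass through

for variant in ("IBM", "i.b.m.", "International  Business Machines"):
    print(f"{variant!r} -> {canonicalize(variant)!r}")
# Three tail variants become one head entity, so the downstream model
# sees a much shorter tail.
```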
The fourth thing I'd say about ML and AI models in practice, and the economics of them, is that building a single model for your problem is not always the best thing to do. Very often it is better to split your data set into smaller cohorts and train one machine learning or AI model per cohort, an approach that is very widely used in industry, for multiple reasons. One of the things driving this decision is to ask whether the data set you are building on is consistent, or homogeneous: do all your users behave similarly? If your data set is homogeneous, one model is a great idea, and it is the default to stick with; again, keep it simple: one model to maintain, one model to retrain, big efficiency gains. But sometimes one model is not advisable, and as I said, the dominant factor is how heterogeneous your data is. For example, if you see very different behaviors across customer sets, regions, geographies, or other ways of splitting your customers, where one region or customer set behaves very differently from another, it makes sense to build one model per cohort, or one model per user segment. This can not only improve each model's performance but also make each model simpler, easier, and cheaper to build. So that's something to be very cognizant of.

Okay, V is asking: can we extrapolate this to using a narrower ML model for the head and torso, and a broader model for the long tail? I'm not sure I understand "narrow" and "broad" here; V, do you want to elaborate? "Yes, Professor; this goes back to the graph you showed a couple of slides ago, where you have a chunky middle and a narrow head." Ah, but that is the data set; that's a data distribution. "Oh, okay, got it; please set that aside, I had it wrong." Right, that chart is a data distribution, and what I'm saying now is about models. A long-tail distribution is by definition heterogeneous, and that is exactly the point: if you have a long-tail distribution, you might want to build one model per cohort, with each model addressing a slice of your data, and that can really help the cost efficiency, the economics, of AI models. The intuitive argument is that if you have different behaviors to model, different complexities to solve for, then the smaller problems are often easier to solve than one global solution. Of course, slicing your data set or your customer base into cohorts brings in the requirement for domain expertise: a domain expert will tell you the logical way to break your customers or data into cohorts that behave differently. Himansu is saying: ensemble models, like random forests? No, Himansu, this is not ensemble modeling. In ensemble modeling you keep the data set fixed, train multiple models on the same data set, and combine their results. This is, in some sense, the exact opposite: you break your data set down, for example by running a clustering algorithm on it, and for each cluster you build a different machine learning model. (Of course, for each subset you could then build an ensemble, but that's a different discussion.) So the idea is: use clustering techniques to split your data set into logical cohorts or groups, and then build one model per cohort; this can be much more efficient.
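(Aside: a runnable sketch of exactly that cluster-then-model-per-cohort pattern on synthetic data, assuming scikit-learn is available. The cohort structure, features, and labels are all made up for illustration.)

```python
# Cluster the data first, then train one simple model per cluster.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Two behavioral cohorts living in different parts of feature space.
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(6, 1, (200, 2))])
y = np.concatenate([(X[:200, 0] > 0), (X[200:, 1] > 6)]).astype(int)

cohorts = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
models = {}
for c in range(2):
    mask = cohorts.labels_ == c
    models[c] = LogisticRegression().fit(X[mask], y[mask])

# At inference time, route each new point to its cohort's model.
x_new = np.array([[5.5, 7.0]])
c = cohorts.predict(x_new)[0]
print("cohort:", c, "prediction:", models[c].predict(x_new)[0])
```

Each per-cohort model here is a plain logistic regression, which ties back to the keep-it-simple principle: splitting first often lets the individual models stay simple.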
The example I have here: suppose you are building a bot-detection algorithm. Bot detection sounds like one problem, but it is in fact several different problems, because there are different types of bots. The search crawlers from Google are bots; data scrapers that scrape content from your website are a different kind of bot; and there are bots that scan and attack ports, the HTTP ports or the TCP ports. These behave very differently, and in fact they operate at different layers of the IP stack, so building one single bot-detection model is probably not the smart thing to do. Instead you should perhaps build several bot-detection algorithms, one for search crawlers, one for scrapers, one for port scanners, and so on, run each of them on the incoming data, and if any one of them flags the traffic, classify it as that kind of bot. You need a system-level approach: ask whether one model will really solve your problem, or whether the problem is so complex and the data so diverse that you should split the data into homogeneous subgroups and build one model per subset. As I said, this is a very common approach in production systems; whether it is fraud detection, loan underwriting, or content moderation, each of these very often uses not one model but many different models. Okay, this next slide I will skip.

The last part I want to talk about today: when we discuss the economics of AI, we must address its impact on humans, what we might call labor economics, or human capital. This is what we will end with. A lot has been written about AI taking away human jobs and about the role of humans in the age of AI; it's an ongoing debate and a rich area of research, so I pulled together some interesting studies I had read. The one I liked most, and which was done most thoroughly, was a study that came out of a collaboration between Harvard Business School and Boston Consulting Group. Let's start here; this is a research paper, and the reference is at the bottom. What they did was take a fairly large sample, about 750-odd consultants, all management consultants working at BCG, the consulting group based out of Boston, if I'm not mistaken. That is reasonably large: about 7% of BCG's workforce, so a good sample. What they wanted to measure: they had a control group and a test group. Half of these 758 consultants were given access to ChatGPT, the professional, paid version, which is GPT-4-based, and the other half were not. So you have a control group and a test group, and both groups were given 18 tasks, not toy tasks but consulting-level tasks.

Sorry, somebody is asking a question about the previous slide; let me address it before I move on. Shant is asking: when we use multiple models trained on different data, in real time will the data be sent to all the models, with one of them responding, or will the data be filtered and sent to the right model? Well, that really depends on how you implement it; you can do either. In the first approach, you take a real-time data point, send it to all the models, and ask each one to make a prediction on it. Or you can apply your clustering algorithm to the incoming data point, determine that it falls in, say, cluster three, and send it only to model three. There is no generic answer; it depends on how you implement the system.
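(Aside: a tiny sketch of the first option, broadcasting to every specialized model and flagging if any fires, in the spirit of the bot-detection example above. The detector rules and thresholds are hypothetical stand-ins for real models.)

```python
# Run every specialized detector on each event; flag if any one fires.
DETECTORS = {
    "search_crawler": lambda e: "bot" in e.get("user_agent", "").lower(),
    "scraper":        lambda e: e.get("pages_per_min", 0) > 120,
    "port_scanner":   lambda e: e.get("distinct_ports", 0) > 50,
}

def classify(event):
    # Each detector is a stand-in for a model trained on its own cohort.
    return [name for name, rule in DETECTORS.items() if rule(event)]

event = {"user_agent": "Googlebot/2.1", "pages_per_min": 10,
         "distinct_ports": 1}
print(classify(event) or "looks human")   # -> ['search_crawler']
```

The alternative, routing each point to exactly one cohort's model, is what the clustering sketch a few slides back showed at inference time.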
Any other questions on the previous slides before we continue with the role of human capital? Okay, let's continue. As I was saying, this study was done by Harvard Business School and Boston Consulting Group. The group on which the experiment was performed was 758 management consultants, split into a control group and a test group. The control group was not given access to ChatGPT, period, and the other half was. Every member of both groups was given 18 tasks to perform, split across different categories. There were creative tasks, where they were asked to produce ten ideas for targeting a shoe at an underserved market or sport; analytical tasks, where they were asked to segment the footwear industry by user profile; writing and marketing tasks, where they drafted a press release; and persuasiveness tasks, where they wrote memos detailing a business case, rather like what you are doing with assignment two. Before I go forward, notice that all 18 tasks are fairly representative of what we call high-value white-collar jobs today, and that reflects in the data. What they were trying to see was the impact of introducing AI in a company like BCG.

That was the setup; now let's look at the results. The headline was that AI improves productivity. We have heard that several times now, but what does it mean? They compared the quality of work done by the control group and the test group, people without access to ChatGPT versus people with access. Here is the result: the quality of the work output across tasks is rated on a scale of 1 to 8 on the horizontal axis, and the vertical axis is the density, the distribution of the work. The green curve shows the quality of work of people who used AI, and the blue curve shows those who did not. Very clearly, the people who used ChatGPT produced higher-quality output than the people who did not. Now, the test group was further split in two: the green group was given absolutely no training, none of the kind of prompt-engineering session you went through, while the red group was given some training on how to use ChatGPT well, perhaps a few sessions on prompt engineering. And the red curve shows that those consultants performed slightly better still.
The quality of their output was slightly better than that of the people given access to AI without training. From a numerical perspective, the consultants using AI finished more tasks, 12% more on average, completed them more quickly, and produced higher quality. I'll leave the slide up; if you have questions, I'm happy to take them. A question from Sasna: how was the quality measured and quantified? The research paper has more details, Sasna, but essentially they took the results to the actual customer, the shoe company, had them evaluate the outputs, and used that as the ground truth when comparing the two groups. Since it was an actual consulting project, the feedback went back to the customer. Okay, any other questions before we move on?

So that's the first insight from this recent study. The second insight is even more interesting: AI is a skill leveler. What does that mean? It's a counterintuitive result, but a very interesting one. AI helped the people who were less able to do the job much more than it helped the people who were already proficient; the way to think about it is that AI is a great skill leveler. Consultants who were in the bottom half of task performance improved by 43%, which is what you see in the left graph at the bottom: if you split each group into top and bottom performers, introducing AI to the bottom half increased the quality of their output by 43%, while for the top half it helped too, but only by 17%, arguably because they started at a higher level, so the scope for improvement was smaller, which is a fair argument. This has a lot of implications for how human resources are managed, hired, and grown. The idea that AI, as an aid or a tool, can be very helpful to people who are not that proficient in a particular skill says a great deal about how organizations hire, upskill, and grow people. The paper had a good analogy: there is a set of tasks where individual proficiency may stop mattering. The example was miners digging through a mine: in the 19th century, people who could dig well were very valuable, but once you had a machine that could do the digging, differences in digging ability no longer mattered. That's a good way to think about how a tool like AI, by lifting people who are less proficient in a skill, can significantly change which skills are valued going forward. So that was the second insight: AI is a skill leveler.

Now the cautionary notes. Those were the two big insights, but the authors also cautioned on a few things. The first was that it is not always clear which skills fit this paradigm of being helped by AI.
There is still knowledge accumulating in the community about which kinds of skills are greatly aided by an AI tool, partly because we are still discovering the applications of these large language models. This is a picture I found very interesting: it says that on some tasks AI is immensely powerful, and on other tasks it performs badly, horribly even, making mistakes no human would make. The picture explains it like this: the capability landscape of AI is not a circle that grows uniformly as AI capability increases, with every task inside it becoming doable. The right analogy is that the frontier of AI capability is a jagged curve, like the blue line: in some places AI can help a lot, and in others it can actually be counterproductive to use it. In fact, though I haven't shown the diagram here, the paper describes certain tasks on which using AI reduced performance: the test group with AI access actually did worse than the group without. There were a couple of examples of such tasks, although for the vast majority of the 18 tasks the AI-assisted group outperformed the other. So it is not a universally applicable statement; you have to be careful about how you use it, and there is no instruction manual. Most people figure it out by starting to use AI; if you use it for a few months, you get a fairly good handle on when to use it and when not to. We already talked about this next part, so I'll skip it.

The other interesting thing I wanted to show, and there is again a lot of debate on which fields AI will affect, is a study listing the occupations most exposed to AI. This can matter a lot for how you plan your business and your operations inside the enterprise, and where you may have to rethink how work gets done: does it get replaced by AI, automated, left untouched, or handled in some hybrid way? This list shows the occupations most likely to be impacted as AI develops, and as you would expect, a wide impact is expected across many fields and domains. More concretely, in terms of which industries are most susceptible, there is a lot of consensus emerging that fields like legal, finance, and business operations are going to be significantly impacted by automation and AI taking over much of the work. And here is an interesting statistic I found: the magnitude of impact on an industry is positively correlated with the industry's median wage. This is why it makes such news-grabbing headlines. Think about the Industrial Revolution, which I believe we briefly discussed in the first lecture.
In the Industrial Revolution, the biggest impact was on blue-collar work. But this line captures the opposite sentiment: the industries being impacted now are the high-paying ones; the higher the median wage in an industry, the greater the expected magnitude of impact. That is an interesting way of thinking about where AI will matter most.

Now, coming back from the macro picture to the micro: there are fields where AI is affecting the labor market right now. Graphic design, for example, where AI is claimed to be about 100,000 times cheaper and 3,000-plus times faster. Take graphic design and image creation: the computational cost of creating an image using today's models is less than one cent, and it takes about a second. If you went to a graphic designer for something comparable, it would cost on the order of $100 and take at least an hour; hence the claim of roughly 100,000 times cheaper and 3,000-plus times faster. I don't think there's even a debate in any organization that tools like these should be adopted immediately, because the impact on the bottom line is significant; some of these are no-brainers, and you see this in multiple places. Fashion is another big industry about to be affected: in the picture on the right, none of the models are real; they were all generated by image generators. Think about the impact on modeling, photography, customer support, and so on. These are all high-impact areas today, even creativity, as we discussed in the first lecture.

This next one is another interesting study, from Goldman Sachs, which asks whether AI is going to displace workers or augment them. It's a slide you can study in more detail, but essentially they map out, on one side, industries where AI will completely replace humans, full automation; on the other side, industries that will not be impacted at all; and in the middle, a human-plus-AI zone. I will end on a slightly positive note: there is a clear trend emerging that in a large majority of domains it will be a human-machine partnership, where the machine goes up to a certain point and humans draw insight from it, using that insight to do even more creative work. That's where I'll leave it. To summarize, today we covered the economics of AI from a computational perspective, a data perspective, an algorithmic perspective, and a human-capital perspective. I'll stop there; if you have any questions, I'm happy to take them now.

Okay, a question from Deepak: on one side we see that capital cost is very high; on the other, we see that it is much cheaper. Is it because of economies of scale, because of how the business model works? Deepak, I'm not sure I understood the question; maybe you can unmute yourself and ask.
"Yes, sir. What we saw earlier is that implementing an AI model can be expensive if we really try to make an accurate model; that was the first section. But in this section we saw that in some places AI makes things very cheap, like image generation. How is that? Is it because of economies of scale?" You are mixing up two issues. The first part of the lecture was about building models; the last three slides were about using a model that somebody else has built. Those are very different things. When I say it takes about one second and one cent to create an image, I am assuming you are using something like DALL·E or Midjourney that somebody else has built, whereas saying "I will build a text-to-image generator" is very, very expensive. Does that answer the question, Deepak? "Yes, but then how does the economics work for the people who built the model? The user gets it cheap, but how does revenue happen for the builder, who must have spent a lot of money?" Because he is distributing that cost across millions of users who want to generate images. Instead of 10,000 graphic designers, there is, in effect, one graphic designer doing the work of 10,000 across the world, with all of them using text-to-image generators. "Got it, so it's economies of scale for him." Yes. "Thanks." Sure.

Okay, Jessica says she has three questions. Jessica, why don't you unmute and ask? "Thank you. Professor, first: how are bots classified? Are these systems also called AI systems; do they come under the umbrella of AI?" Okay, let's go deeper on this. A bot is a piece of software that typically does not have a user interface and is supposed to carry out a series of instructions. If you look at fields like robotic process automation, they create a lot of bots. "Bot" is a very generic term for something that executes a series of instructions and is typically faceless. A bot may or may not use AI underneath: it can be hard-coded to carry out a series of instructions, in which case it is not an AI system, or it can use an AI system, in which case it becomes an AI-powered bot. Take chatbots: theoretically you could write a chatbot without an AI behind it. The earliest chatbots were backed by databases with a series of pre-recorded questions and canned answers, like an FAQ; they looked up the question and spat out the answer. Still a chatbot, but not powered by AI. Today's chatbots, of course, are much more powerful and are backed by AI. So the fact that something is a bot does not tell us whether it is AI-powered. "Understood; I think that also covers my second question. The last one: what cost considerations should we keep in mind when using SFT? Is that also computationally very expensive?" That's a good question. First, let's draw a boundary between training a model, which is what I've spent most of today talking about, and fine-tuning a model, which typically happens after the model has been built, when somebody else, or you, or another team picks up that model and adapts it to a particular domain. That is supervised fine-tuning. Somebody asked earlier whether supervised fine-tuning is also expensive, and it was a good question, but it's a tricky statement: you have to ask, expensive compared to what? We are not talking in absolutes. Say you have three options on the table: (a) build a model from scratch; (b) take an existing pre-trained model and run a supervised fine-tuning algorithm on it; or (c) take a pre-trained model and build a retrieval-augmented generation system on top. Even with this information, I would say there is no universal answer; it depends on your requirements. In some cases, where the model will rely primarily on internal data, it may still be cheaper to build from scratch. Where your model will benefit from natural-language understanding, starting with a pre-trained model might be beneficial. Going further, if you are adapting the model to a knowledge base, retrieval-augmented generation might be better; but if you are trying to control the style in which the model speaks, supervised fine-tuning might be better. "Understood, so it's very use-case dependent." Yes. "Understood, thank you." Sure.
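(Aside: the heuristics in that answer, condensed into a toy rule-of-thumb function. This is an oversimplification for illustration only; as the answer says, the real decision is always requirement-dependent.)

```python
# Toy decision sketch for scratch vs SFT vs RAG, encoding only the
# coarse heuristics mentioned above.
def suggest_approach(relies_on_internal_data: bool,
                     adapts_to_knowledge_base: bool,
                     needs_style_control: bool) -> str:
    if adapts_to_knowledge_base:
        return "RAG on top of a pre-trained model"
    if needs_style_control:
        return "supervised fine-tuning of a pre-trained model"
    if relies_on_internal_data:
        return "consider training from scratch on internal data"
    return "use a pre-trained model as-is"

print(suggest_approach(False, True, False))
# -> RAG on top of a pre-trained model
```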
Jack is asking: how could we retrain our existing staff to adopt AI, if that is possible? Wouldn't the training cost be huge? Well, it depends on what you mean by "adopt AI." If you look at the Harvard Business School and Boston Consulting Group study, all they did was give their staff access to something like ChatGPT, and the learning curve on tools like that is actually very low for most white-collar workers; they latch on very quickly. The challenge is guarding against misuse of the tool and, more importantly, ensuring that staff do not blindly trust what the chatbot outputs: they need to understand the limitations of AI and double-check its results. Training sessions are the way to go, whether face-to-face, pre-recorded videos, or references on the web; you have to invest in that re-skilling. I don't expect the cost to be very high if your primary objective is to teach people how to use AI in their work, and it will get easier as the market is flooded with AI tools specific to different industries. On the other hand, if you are going to build models yourselves, then you should not retrain the whole company; you should upskill a small subset of employees from the tech team, or hire people who come with those skills. That is a much smaller group than a company-wide effort, and a very specialized requirement, for when you decide to build an ML or AI model in-house.

Okay, Satish is asking about critical thinking: how good are LLMs at it, and is there any measure? That's a difficult question, almost bordering on the philosophical. How do you evaluate critical thinking in humans? That is the bigger question to ask. If you believe IQ tests are a good measure of critical thinking, then models like GPT-4 have already been given IQ tests and have performed at the 99th percentile, so in that sense they are performing at roughly human level. But as you know, critical thinking is a tricky subject.
We know, for example, that the logical capabilities of many of these machine learning models and LLMs are limited, especially when it comes to complex reasoning or reasoning that requires multi-step inference; it is not easy for these models to work through such chains of reasoning. The prompting techniques I briefly mentioned, like chain-of-thought and tree-of-thought, are being tried as ways to improve their critical-thinking skills. As far as measures are concerned, my answer would be: whatever measures or metrics you use for the critical-thinking skills of humans, use exactly the same ones for LLMs. Okay, any other questions?

If there are no other questions, I want to spend a few minutes on your assignment. I hope you have been working on it and have submitted; next weekend's two sessions, the last of the course, will be where the student groups make their presentations. Let me speak for a few minutes about how the evaluation will work. We will partly use peer evaluation for the group presentations. Each group will pitch a startup idea, and every student in this cohort will be given a virtual million dollars; I will be holding a few million dollars in my own pocket as well. Each individual will act as an investor, with the right to distribute their virtual million across any of the 11 groups, excluding their own, in any way they see fit: wherever you think an idea has potential, you are free to allocate part of your million to that group. When both days of presentations are over, this data will be collected, and the total money your group raises will determine your marks out of 30 for that assignment. We have modeled it as a startup-meets-VC exercise, so you have to convince not only me but your peer group that what you are proposing is a great idea: you are raising money for your startup, which uses AI in your domain, and your peers will evaluate you. That's what we'll be doing next weekend.

Tomorrow's class is a buffer, or spare, class. If you want to get together with your groups to prepare for next weekend, feel free to skip the Zoom call and set up your own meeting. I will be available during tomorrow's session: if there is something you want me to briefly revisit from any of the previous sessions, join the link that has been shared with you, and depending on how many requests there are and how many people want a subtopic repeated from the last 14 sessions, I'll be happy to repeat it or answer any questions. We will not cover new material tomorrow; you are free to use that slot with your group to prepare for the presentations on the 2nd and 3rd, which will be the final two sessions, or join the link as usual, and I'll help to the best of my ability. Aron, you have a question? Please go ahead.
"Thank you, Professor. For the presentation we are planning for next week: we each built an AI business model canvas for a company in our individual assignments. Can my group agree among ourselves to pick up one of those for the upcoming presentation?" Sorry, I did not understand the question, Aron. "The upcoming presentation is about an AI company, right?" The upcoming presentation is your group pitching a startup, yes, in your domain, using AI as a key enabler. "And we have AI business model canvases in our individual assignments; can those ideas be considered?" Sure, absolutely. For what you choose to present in your group presentation, where you are pitching a startup idea, you are absolutely free to pick up ideas from your individual assignments, but it should be with the consensus of your group members. "Perfect, thank you, Professor." Sure. Suup, you have a question? "Yes, I wanted to announce one more thing, regarding the sequence in which the presentations will be done. We plan to randomize the sequence, just to make sure there is no bias and everybody gets an equal opportunity; we'll randomize the group order and share it in tomorrow's session." Okay, Suup, when do you want to do that randomization exercise? "We can do it live tomorrow." Then you are basically asking everybody to join for ten minutes, right? "Yes; for the first ten minutes I request everyone to join the call, and I'll randomize the groups live, so we get the sequence of which group goes first, which goes second, and so on." Sure. So folks, if you could join for the first ten minutes, we will do the group sequencing; then people can drop off for their group discussions and presentation prep, and those who want to stay for doubt clarification, or to have subtopics repeated, can stay back. Jessica, go ahead, please. "So, Professor, can anyone from the team present? If there are four or five of us, can one person present, or is that something we decide among ourselves?" That's completely up to your group. We encourage every group member to present, but it is absolutely not required; we leave it to the group to decide. "Okay." Manoj, go ahead, please. "Professor, I had two questions; Jessica asked one, so I'm clear on that. The second is about the startup idea: when we pitch, should the main focus be on AI and the technology, or on the business, the valuation, how we'll generate revenue, the kind of thing we see on Shark Tank?" Manoj, think of it as trying to raise money for your startup from a VC: you have to cover all the aspects needed to convince your peer group to give you money. You must use AI as an enabler; that is the only hard requirement. You cannot pitch a pure Shark Tank idea in the sense that, if it has no AI component, it has no place in this course. Beyond that, whatever you need to do to convince your peers to invest in your company, you should do. "Okay, thank you." Okay, Kevin, go ahead, please.
"Yes. So, Professor, it follows that if we are going to sell the idea, we could also do a prototype, right, to catch everyone's attention?" Oh, that's a great one. If you can do a prototype, I'm sure you will draw a lot of money from your peer group. Fantastic. "Thank you so much." Okay, Aron, I'm assuming you have no more questions; your hand is still raised, but I'll take that as a leftover and lower it, and the same for you as well. Any other questions? Okay, we will close the session today, and I'll see you folks tomorrow. I request you to join for the first ten minutes; we will do the sequencing of the group presentations, and after that you can drop off or choose to stay on, that's your call. Good day, good night, everyone. "Thank you so much, everyone. Bye-bye." "Thank you, good night, everyone."