Transcript for:
Insights on Cloud Computing from Berkeley Faculty

I'm here with some members of the Berkeley computer science faculty who have just produced a report on cloud computing. Why don't you all introduce yourselves?

I'm Dave Patterson.

Randy Katz.

I'm Anthony Joseph.

And I'm Armando Fox.

So, a little bit about this paper. What's it called?

It's called "Above the Clouds: A Berkeley View of Cloud Computing," and it's the result of work that a bunch of us have been brainstorming on for the last six months.

And so what is cloud computing, really?

Cloud computing is the ability to migrate the computation that used to happen at the edges into the network, and it gets realized through these large-scale Internet data centers.

Why is this happening now? What changed to make this an interesting topic? Why are you guys writing this today and not ten years ago?

We tried to identify a number of factors for why this is something we should talk about now as opposed to ten years ago. Probably the one that kept coming up most often was the ability to offer this pay-as-you-go computing, such as, for example, what Amazon is doing: the fact that they can essentially allow anyone with a credit card to almost instantaneously get what appears to the user to be almost infinite resources on demand. The only way that could be provided economically is if you start out by having an extremely large data center that can statistically multiplex all of these different users on it. And arguably, ten years ago the demand curve of the Internet had not yet gotten to the point where we had not one but actually several major players who were building data centers at that scale: Google certainly, Amazon certainly, Microsoft, and many others as well. The idea was that if they already have all this capacity, and if they need to develop the operational expertise to use that capacity internally and to be able to multiplex it across different applications and so on, there's an opportunity to derive additional revenue, essentially from productizing that. So that's an
opportunity. It's really just in the last few years that the scale has gotten big enough, and also, frankly, the open-source software stack has gotten so rich, and there are so many different building blocks, that someone starting out from the bare hardware can very quickly get lots of pieces of an application up and running based on open source. That was important as well.

Don't you think it also has something to do with the ever-increasing commoditization of the hardware?

Definitely, absolutely. Maybe 10 or 12 years ago it wasn't necessarily a given that the Intel architecture was going to become the de facto standard for building out commodity data centers. And of course, the rise of fast virtualization, such as Xen and VMware provide, means that you can very efficiently slice up a single machine and essentially resell it to many people.

I think also, if you look at the demand side of it, the ability with cloud computing to very rapidly scale up from small numbers of users to huge numbers of users is very attractive economically. It means I don't have to plan ahead for how large a demand I think I'm going to have; rather, I can scale incrementally.

We've talked a lot about who might consume cloud computing and who might provide it. Why would someone want to offer this service?

Well, that's a very good question. To some extent, the first generation of cloud operators need these technologies for their own websites. They want to exploit elasticity in the way in which their own workloads grow and shrink over the course of the year. The kind of websites they provide may be very busy at the end of the year and not so busy in January, that kind of thing. Given that they already have to install that kind of technology to support their own major website, like, for example, amazon.com, why not try to exploit additional revenue opportunities by making it available to third
parties when you yourself are not using it? Or, that allows you as a website operator to engage in economies of scale by simply buying more resources that you know you will eventually need as your website grows, but in the meantime getting revenue from them by allowing third parties to use them in a cloud computing environment.

There's also, in many cases, a very good business case to be made for using your data center for cloud computing. So, for example, Microsoft has built a number of large data centers because they operate a lot of large-scale online properties, but of course, as we all know, they're also very successful in desktop software. So there's a natural opportunity there to enlarge a successful franchise if you can provide some kind of added value to your installed desktop software user base by allowing that software to become more powerful when it's extended into the cloud. For example, you can imagine Excel doing extremely expensive computations back in the cloud, or using the cloud to facilitate collaboration among people who are editing documents. So another answer to the question of who would go into the cloud provider business is that there may be opportunities where you can enlarge or defend a successful business by adding value through cloud computing.

Yeah, I think one more point there, on the question of why this happened now: we think that these cloud providers are just dot-com era companies that started providing services over the Internet, survived, and became so popular that they were forced to push the bounds of what people could build. They started building much larger data centers, built out of commodity hardware. I don't know if they would have done this on their own as a startup company, but they were forced to get to this spot. And by getting to the spot of data centers with tens of thousands of servers that could serve millions of people, they discovered
that they were able, in terms of cost per CPU time, per byte transferred or stored, to create a new breakthrough in how low the cost could be. And it was so much lower than for most people's data centers that they could actually sell it at a profit. That is what led to this opportunity.

You're saying it's more efficient to have ten thousand machines than 100 machines?

Yes. What we're saying is that this comes back to enormous economies of scale. The claim is (I don't know that people claimed this in advance) that they were forced by their workload to build things with, instead of hundreds or thousands of servers, tens of thousands of servers, and once these were constructed they recognized that they had huge economies of scale over smaller installations, with factors of five to seven difference. That's such a big difference that plausibly they could sell you their servers cheaper than you could build it yourself.

I mean, one key thing is that the providers are buying machines in batches of hundreds or thousands, so they're deploying a very uniform, homogeneous environment. From a management and maintenance standpoint, that dramatically reduces the cost.

Should I be nervous, if I put my data in the cloud, that my cloud provider will lose it?

You should be nervous wherever your data is concerned.

Yeah, I think you shouldn't rely on the cloud provider as the sole backup of your primary data. I think it's very important to have a disaster recovery plan that includes backups and other kinds of options.

Yeah, I think Anthony makes a good point. We quote Richard Stallman in our paper, who is very concerned about cloud computing basically becoming a, quote unquote, proprietary software trap for users. One of the scenarios he envisions is that once your data and applications are in the cloud, if they're subject to proprietary software that's holding them in
place, essentially your data could be held captive. So I think, separately from the concerns of availability and backup and having redundant copies of your data, there is a question, from a business continuity standpoint: if you need to get your data out for whatever reason, how easy would it be to do that? Would you end up incurring an additional unforeseen expense just to be able to move your data from one place to another?

I think this leads to another one of the big challenges in this area: each of the cloud operators has its own unique environment for developing and hosting applications. In fact, one of the standard ways of mitigating the risk is to allow your application and your data to live among multiple providers, but currently there is no easy way of making that happen. It's completely up to the user, or rather the application developer, I should say, who will have to deal with the potential heterogeneity of the different interfaces across multiple cloud providers. But in a sense, your responsibility for making your data survive any problem with that data at one operator is to spread it among multiple operators, including possibly yourself.

We've been talking a lot about applications. Will this fundamentally change how we think about writing software, or is it the case that I will still have this model where I write it on my desktop and then I run it on however many machines in the cloud?

Well, I think one thing that you want to consider when you're developing applications is that, instead of necessarily trying to make the serial performance of that application as fast as possible, it's important to make that application parallelizable. If you can make it parallelizable, then you can deploy it on the cloud and scale up.

Parallelizable in this case meaning for scaling, rather than so much the multi-core kind of parallelism. Horizontal scalability would be
the virtue in cloud computing: given that you can buy and discard instances almost instantaneously, you'd love your apps to be able to do that.

Another thing, besides the software environment, is the enabling ability of cloud computing. As we say in the paper, we draw an analogy to what happened in the semiconductor industry: as fab lines got more expensive, it bifurcated the industry into people who designed chips but didn't have fab lines, and people who had fab lines. We think the same thing could happen here: anybody who develops software could now have their own data center, thanks to cloud computing. So it could inspire more people to enter the area, and maybe make software as a service even more attractive than it is today.

Any final thoughts?

I think the reason we wrote this paper is that we think this is a big deal. We think cloud computing is going to transform the IT industry, software and hardware. We think that, as a result, people who are going to remain in this industry should be rethinking how to do applications, how to do the infrastructure software, how to do hardware. I'm sure this will take five or ten years, but in five or ten years we'll look back at 2008, 2009 as a milestone in the industry, and it'll look quite a bit different afterwards. If you're going to be working in this industry, you should become aware of it and make plans accordingly.

I think it's the kind of paradigm shift that happens every few years in the information technology platform arena. It's as important, as we move forward into the 21st century, as client-server computing was in the 80s, or the initial wave of Internet-style computing, web access and so on, in the 90s. This is kind of the platform for the next decade.

I would echo what David and Randy said, and also add that we wanted to provide clear, concise terminology, because of all the confusion around cloud computing, with a lot of discussion as
to what cloud computing is and isn't. We wanted to try to provide one set of ground truth for thinking about cloud computing. We also wanted to highlight that the future is not completely rosy for cloud computing, but that there are significant challenges that require technical, policy, and other kinds of analysis and research before they can be addressed.

Yeah, I think one thing that never fails to amaze me about this field is that when a confluence of tech trends, such as we are seeing for cloud computing, makes possible a new way of solving a problem, a new way of building a system, it's amazing, once it gains some traction, how quickly people can adapt. At Berkeley in the mid-90s I had the pleasure to be involved, actually with Randy and Dave, in the Network of Workstations project, and at that time it was by no means a done deal that very large-scale, millions-of-users applications were going to be built out of commodity clusters. Nobody really took that for granted in the mid-90s. And yet once it gained traction, it really swept away everything in its path. I mean, that is the way Internet services are built now: out of large clusters of commodity components. And yes, it meant that software had to adapt. We had to think differently about how scaling and partial failure models were handled. But the opportunity was so great that people rose to those technical challenges, and I think cloud computing has the potential to be a similarly seismic event in that way. So we're excited.