Transcript for:
Challenges in Online Streaming Infrastructure

Let's just return to the infrastructure of the platform of Netflix and speak more generally: Netflix, Twitch, YouTube. Anytime I use any of these services, I'm just blown away by the infrastructure it takes to deliver this service. YouTube and Twitch are unique versus Netflix in that the creators can roll in themselves and upload stuff. So on the consumption side, YouTube has over 100 billion views a day, over 1 billion hours of watch time, but on the creator side, 1 million hours of video are uploaded every day. 1 million hours. You have to service both, and you have to deliver everything. It's incredible to me. Can you maybe speak to your own intuition, just zooming out on it, about what it takes to deliver that kind of infrastructure?

For me, the thing that I find vastly complicated, and I can't imagine the engineering hours, is how do you even create an edge in that situation? And what I mean by an edge, if you're inexperienced with the phrase: an edge is where you deliver data. You want that edge to be as close to the customer as possible, because that's where the data lives, and then the communication between the customer and what you're doing is really, really small. Obviously the speed of light adds up, the number of hops adds up, the number of services that you have to remotely call adds up. They all add up, and they all add inefficiencies to the system. So something like YouTube, they want to be able to serve that data as quickly as possible, but their data changes constantly, and relevance is almost directly tied to the newness of the item. So it's like, how do you even cache these things out? How are you doing this? They must have such an incredible caching network that I can't even fathom what it takes to do that. That, to me, is just so impressive. A million view hours, in how many different resolutions, with how much data? What is a million view hours? Is it 4K million view hours, along with 1080p, along with
720p, along with 1440p? Like, that number is an insane number.

Actually, it is brilliant what you said, which is that for YouTube, often the new thing is extremely important to show to everybody, and so you can't rely on caching, or on trivial kinds of caching. You have to deliver the new thing as quickly as possible. Yeah, I mean, it's incredible. So there's the entire system, the recommendation system that knows each individual human watching YouTube, and it has to integrate into that the new thing, while also caching this incredible cluster of possible videos that you're potentially interested in, and integrate into that ads, right, in the case of YouTube and Twitch and so on.

It's a really tough problem, because you have to think, what is the cache hit rate on this? Because the problem now actually comes down to space. Space actually becomes a real problem. How many hundreds of petabytes do they have? What do we cache, and where do we cache it? I think in terms of gigabytes, or maybe megabytes. They have to think in probably versions of bytes I don't even know the name for, right? It's such a different problem. And that's why I said Netflix has a much easier job when it comes to caching. If you've never looked it up, it's called OCA. We know what videos we're releasing, we know what videos are hot in specific areas. It's a very limited set. We're not going to all of a sudden get, oopsies, we got a million new view hours, right? We don't even have to worry about that as a problem. Instead it's like, okay, we know Stranger Things season 5 is about to drop, so we're going to pre-cache Stranger Things season 5 in every single OCA across the world, because that thing's about to get hammered, right? And so it's able to do such a different kind of decision-making than what you have to do with something like YouTube. And then Twitch is even more wild, because now you're
actually ingesting video and trying to make it go out all at the exact same time for all video, and you have to transform that video from whatever format and whatever bit rate it is into something that's more efficient in the system. Hats off to Twitch engineering, because that is some serious work.

And here's some Lex coming out and tweeting about YouTube features. Listen, you're not wrong on the features you ask for, though.

I think this is an engineering problem of how you allow fast iteration and addition of features that shouldn't have to be integrated into, or impact, the whole codebase. So at the edges of the codebase, you sort of improve on certain features without having to consult the mothership of the code.

It's the large team, right? That's the fundamental problem. When you get to YouTube size, there is the team/organization that deals with data warehousing, there's the team/organization that deals with delivery, there's a team/organization that's like the middle layer, you know, there are going to be the little microservices to talk to these places. Then you have this front-end engine. So for a small feature, you have to get the middle team, you have to get the backend team, you have to get all these things. Quick example from Netflix. Are you familiar with the dystopian Black Mirror?

Yeah, yeah.

Okay, season one, episode one. Do you know season one, episode one? Everyone who watches Black Mirror typically knows this episode.

Okay, yeah, I don't remember what it is.

Forgive my language, but they call it the Pigfucker episode.

Oh yeah, of course.

Once you've seen the episode, you will then know this episode. Well, when Netflix adopted it, I got pulled into a room. There's a VP, a VP, a product designer, a VP, and they said, "Hey, we're about to release our own version of Black Mirror, season 3."
I think, at that time. "We need episode 1, season 1 to not be the first thing people see, so let's just reverse the season order." I had like 20 engineers I had to gather together to be able to have this happen. And that's just the problem of big companies: eventually every little thing has to become its own team, and so there's no such thing as a small feature. Reversing the order of the drop-down that selects the seasons is a meeting with a bunch of VPs and engineers.

That's really interesting. There's got to be a way to accelerate that. The natural scaling of a company, and the bureaucracy that grows, yes, slows that down. But just having seen Elon work a lot, his teams are able to still keep it very fast even as the company grows. There's got to be a process to doing that. Especially, yeah, for the pig episode, I don't know where that is in the priority list, but for important things like that, you should be able to do that quickly. I don't know, can you speak to how you would do that?

Well, I can tell first how it was done. Remember, at a place like Netflix there would be, I think at that point it was a product called Dexter, I can't remember. There's our actual movie metadata warehouse that's going to be highly integrated with Hollywood, that's going to be, you know, where that side is able to manage all that. So I'm like, hey, you need the ability to mark things that need to be reversed, because we're going to run into this a bunch. And we did. We ran into quite a few topical shows that all needed to be reversed and all that. And so it's like, we need to be able to reverse episode numbers and season numbers, and we need to be able to hide season or episode numbers. Like in the case of the Chelsea Handler show, it was like a daily show, so you don't need episode numbers, you just need the latest one. So there's this whole problem that exists, and so it's like, okay, you need to work on that for your UI
over there, then you need to be able to store that data. Then we need to be able to go to the people that can actually get the video data out of that and provide it to our service layer. I need to go talk to them and convince them they need to give me the new methods and everything to do that. Then I need to go write the methods to get it down, and then I need to go to the UI and make that accessible. Now I need to go to the website people, I need to go to the mobile people, I need to go to the TV people. And so you can see this thing snowballing. And for us, the big thing that Netflix did so well is that after I met with these people that were high level, I was the captain. I'm the captain now, yeah. So I went to all these teams and said, "Hey manager, I need an engineer. We need to get this done within the next couple of months because we've got Black Mirror coming out." So she would go, "Okay, here you go." The map team: "I need someone to help me with being able to get data out of the lomo for this." And so it's like, "All right, you're working with this engineer."
I'd go to the VMS team: okay, I need this engineer. I'd go to the billboard team: I need this engineer. I'd go to all these little places to get all these little pieces of data, and then I was the captain, so I was like, you're working on this, you're doing this, you're doing this, you're doing this, I'm doing this, let's go, right? And so that worked, and we were able to go pretty fast for a big company. Even though it required like 20 engineers to do such a simple task, we were able to do it in, gosh, I'd say about 3 weeks' worth of effort. I thought that was amazing, comparatively, given how many people moved.

Well, because you have the freedom, the agency, to do it. You said the captain of the ship. That's really powerful. For big companies that's a risk, because you can [expletive] it up. You might not see the bigger context, legally or otherwise, the bigger context of the impact on the industry, or all the contracts that are made, all that. So it's a risk, but it's a risk you have to keep taking. And then when you [expletive] up, you fix it, and then maybe pay the cost legally for that, whatever. But in the long term that risk pays off, because you're going to keep creating a better and better product, evolving where the industry is going, constantly innovating ahead of where the industry is going, and so on.

Yeah, and not only that. I think one thing that is just so important is that, yes, the product will get better, but the people that you hire and the people that you keep around are better, because they're the ones that show maturity. They're the ones where you give them something and they can rally the troops and make something happen. That's a very great group of people to hire. And so you also naturally select out great engineers that aren't just simply good at coding: they're good at coding, and they're good at explaining, and they're good at convincing. You have to create a very lean audience that can move fast.

And I think
for great engineers, having to wait, like, "Okay, let's schedule a meeting for next Wednesday with the VPs," that destroys their soul, and they either don't want to contribute anymore, or they leave the company, or they just kind of tune out and take the golden handcuffs and just, you know, buy a nice house and focus on family.

I feel like I would die under that. Honestly, that is my death sentence, where there's no reason to try, there's no reason to do anything. I'm just going to go in there, effectively zombie through my day, and call it. I don't want to live like that. I want to feel like I'm trying to do something.

I should also mention, on top of that, so you've brilliantly laid out how incredible the challenge is that Netflix has to solve. On top of that, with YouTube, you know, the metadata thing: because users are able to upload video, and there's an API where they can upload automatically and change all this kind of stuff automatically, every one of those things is an attack vector, as we mentioned. That's something they have to consider seriously on the engineering side, and on the legal side they can get into trouble in all kinds of ways, so they have to consider all of that. So it's just, yes, fascinating.

The legal side is obvious, but it's not really, like, I would never have initially thought someone would, say, upload images that you're not allowed to own or have, but I guarantee you that happens. Then you have the whole kid side, right? Think about when you mark something as kid-friendly. How many times have they snuck porn into a Taylor Swift video, or whatever it was? That was a few years back. There was that whole Taylor Swift or whatever, I forget what it was, I thought it was Taylor Swift, but there'd be these mock videos that come up, and then boom. That is such an awful problem, and I'm so happy that is not a problem I have to try to figure out.

Yep. Okay, so, yes, YouTube and
Twitch and Netflix are doing an incredible job.
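
The caching contrast from the conversation, a known catalog that can be pre-cached ahead of a release versus a stream of brand-new uploads, can be sketched in a few lines of Python. This is only a toy illustration and not a description of how Netflix, YouTube, or Twitch actually cache content. The `LRUCache` class, its `warm` method, the capacity of 3, and both request traces are hypothetical names and numbers chosen for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Toy least-recently-used cache that counts hits and misses (hypothetical)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def warm(self, keys):
        """Pre-position content before any request arrives (not counted as traffic)."""
        for key in keys:
            self._store(key)

    def request(self, key):
        """Serve one request; return True on a cache hit."""
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        self._store(key)
        return False

    def _store(self, key):
        self.items[key] = True
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used item

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def replay(cache, requests):
    """Feed a request trace through the cache and return the resulting hit rate."""
    for key in requests:
        cache.request(key)
    return cache.hit_rate()


# Assumed scenario 1: a scheduled release. All three titles are known in
# advance, so the cache is warmed before the first viewer shows up.
catalog_requests = ["ep1", "ep2", "ep3"] * 4
prewarmed = LRUCache(capacity=3)
prewarmed.warm(["ep1", "ep2", "ep3"])
catalog_rate = replay(prewarmed, catalog_requests)

# Assumed scenario 2: fresh uploads. Each video is new, nothing can be
# warmed ahead of time, and each video is requested twice.
upload_requests = [f"upload{i}" for i in range(6) for _ in range(2)]
cold = LRUCache(capacity=3)
upload_rate = replay(cold, upload_requests)

print(f"known catalog hit rate: {catalog_rate:.2f}")
print(f"fresh uploads hit rate: {upload_rate:.2f}")
```

In this sketch the pre-warmed cache serves every request from cache, while the cold cache misses on the first request for every new upload, which is the "what is the cache hit rate on this" question in miniature.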