Transcript for:
System Design for Front-End Engineers - Facebook News Feed

Hi, welcome to the channel, and welcome to the first video of the system design series for front-end engineers. Currently I'm preparing for front-end interviews at a FAANG company, and I struggled with preparing the system design part, because there is usually very little information on the internet about system design for the front end. There are lots of courses about back-end system design, but for the front end we have a problem, so I've decided to make my own videos about it. This is not ideal content and it's not a course; it's really just my thoughts and my preparation for this interview.

The first problem we are going to look at is the Facebook news feed. This is a very widespread interview problem, so let's try to design it in a limited time frame of 15 minutes.

Okay, so first we need to start with the plan of our system design. Let's just try to summarize everything we are going to do. First, we need to start with the general requirements. The second thing to deal with is some specific requirements about platforms. When we have all the requirements and we have the numbers, we can start thinking about the componentization of our feature, so the third point is the components architecture. The fourth point: each of our components works with data, and we need to understand which data we use on the front-end side, so this is going to be data entities. Next, we need to fetch this data, so we need an API and we want to have some fetching points inside our architecture; the fifth point is the data API. What's next? We have our entities and we have our API, but applications usually have a front-end store, and it's important how we organize this store, so the sixth point is the data store. What about the seventh point? The main feature of the Facebook news feed is the infinite scroll, and we are going to see how we can
implement this, so the next point is infinite scroll. Eighth point: we have the whole design, but we need to understand how we optimize our application, so the eighth point is optimization. And we want our feature to be accessible on a wide range of devices and also to people with disabilities, so the ninth point we are going to look at is accessibility.

So we have the whole structure for the system design. Let's start with the first step, the general requirements. General requirements are about the feature we want to build. We want to have an infinitely scrollable news feed where stories appear based on the user's subscriptions to groups, pages, friends and so on. We also want the user to be able to share these stories and to send data along with a story, like images, links and so on. So a user can share a story, and a user can post a story and attach comments, links, images and videos. What else do we have? We talked about the infinite scroll and the basic features we want to have, so we can go to the specific requirements.

The specific stuff, like non-functional requirements, is about the devices and some support concerns. We want the feature to be accessible on a wide range of devices, and we want the feature to be accessible for people with disabilities.

Okay, so we're done with the requirements and we can go to the next step, the components architecture. How does a story actually look on the front-end side? Basically we have the story component, which contains the avatar, the title, the date of posting, some text and images, and there is a control panel with like, comment and share. Besides that, a user can also post comments down below, so we also have the comment list. So basically this is the
mockup of the story, and the news feed is actually a list of stories, so we have multiple of them. These are the basic components we need to have, along with the basic data: for comments we have the avatar and the text, and for the comment input we have some actions.

Okay, that's cool. We have the mockup, but to understand how the components are organized inside our source code we need a dependency graph, where we can see how each component depends on the others. This also helps us understand the data flow in our application. So let's try to draw this schema. I've already prepared all the schemas in order to save some time, and I recommend you do the same, just to save time in the interview if you get this problem.

Okay, as you can see, we have the news feed, and the news feed contains the story components; basically there are lots of stories there. The story has the story card, comment list, comment input and comment. From here we can start thinking about the data we need in order to render all this, so our next step is the data entities. I'll keep my labels here in order to understand where each part is, so this one is the architecture.

Okay, let's go to the data entities part. As you can see, we need to have the story data, comments and some images. How can we describe such an entity? I'll use TypeScript notation here, so don't worry about that; I feel like this is the most convenient way to describe the data. So what is the story? The story has some ID, a number or a string, it doesn't matter. It also has comments, which is a Comment array. It also has some media, which is a Media array; media is links, video and other entertainment content. What else do we have? We also have the date, the date of creation, and we are going to store
this as a timestamp. We also have the content, which is the text the user types in. The story also has the user; a user can be quite simple, let's say it's a nickname and an ID. Okay, I'm sorry, here it's actually not the user but the origin. The origin is any source of the story; the source can be of any type, like a page, a group and so on. So we have the type, which is the origin type, and the name. What else do we need? We have the media and the date, and as you can see I have the share ID, so I think that's enough for us. We also need some additional data: for example, for a user it's an avatar, and for a group it's the group information and so on. So I'll say this is the custom one.

Okay, then let's describe the other entities we have. We also have the comment. The comment has the ID, it also has the media, and it has the author of the comment, which we'll keep as the user ID. Media is still a Media array, date is the creation date, and content is the actual content.

Okay, so we have designed the whole thing here: we have the types Story and Comment. We also have the origin type, but it doesn't really matter for the front-end design to understand the origin, so we can skip it. Media is just data with a format, so let's describe it too. What is media? It's any source of media data, and we have a limited set of possibilities for that. Let's say we have the type, which can be a link, it can also be a video and so on, and we also have the URL. And that is the media.

Okay, so we've designed the data models we want to work with. The next step is the data API. Let's label this one too, let's say "data models", and let's go to the data API. So what API do we need? I'll also create a box here and try to come up with all the endpoints we have. The first one is the
get posts, and the get posts API is pretty simple. We have the API key, which provides us access to our API. We can also have the user ID in order to get stories for that user. The third parameter: for example, if we don't want to show the comments, we can provide a little flag like "exclude comments". The next thing is the timestamp, or cursor. It indicates the timestamp we need to load data from; for example, if we loaded data from two hours ago, we want to have the data before that cursor. So we have the cursor, but we also want to specify, for example, the page size, and also the maximum ID we want to fetch. For example, if we have stories with IDs from 1 to 1000 and we have already loaded the stories with IDs after 900, then we say that the max ID (let's call it min ID, actually) is equal to 900, and this will be our range.

So we have get posts. We also have the POST create post, which also takes the API key, the user ID and the post data. The third endpoint is create comment, which requires the API key, the user ID and the post ID, in order to indicate which post we would like to comment on. What else? The comment data itself.

Okay, we have three endpoints, so let's try to understand with which protocol or technique we can call this API. We don't have many possibilities, but let's start from the basic one. The API can be called with simple REST, so we can organize our API with the REST architecture, or we can go with the GraphQL approach. In production, GraphQL is very useful for multi-level data, because we can easily select the fields we need for the client. REST is not that agile here, and we need to create several REST endpoints for that. But REST is actually very scalable, because we get the advantages of the HTTP architecture, like the caching that comes with HTTP by default. With GraphQL, because GraphQL uses POST
requests, this doesn't work out of the box, and GraphQL manages its caches by itself. Basically we can choose either of them and you won't be wrong here, but let's say we want to use REST.

Okay, we've just written the endpoints and now we can fetch this data, but next we need to organize this data on the front end. We've defined the data models, so let's go to the next point, the data store. In front-end applications we need fast access to the data. The resources of the front end are quite limited, because the browser runs on a wide range of devices, and we want to provide the fastest possible access to our data. How can we organize this efficiently? Basically we need to define the fetching points of our application: where we fetch the data and how we pass it down. I'll copy the component schema we had previously and add the store to it. As you can see here, the blue rectangle is the fetching point, where we fetch the posts with the parameters. So we fetch the data by the cursor, that is, the time frame we want to have the stories for, and then we push them to our front-end store.

But how do we organize the store? A store on the front end is used most efficiently if it is organized in a normalized, flattened state. We don't want to have multi-level structures on the front end, and we don't want to filter large arrays and so on. For that we flatten our store and normalize it by the ID of the feed. For example, if we are loading the feeds, then we save our feeds in a map structure where the ID refers to the feed. The feed ID also refers to the user, the feed ID also refers to the media for this feed, the origin can also be accessed by the feed ID, and the comments can be accessed by the feed ID. This gives us efficient access. Let's go to the example: we have the news feed, and we pass the feed ID inside the
story, and then we access the store with the feed ID. We need the feed, so we access it by ID and pass it to the story card; we need the comments, so we access them by feed ID and pass them to the comment list, and so on. The comment list has each comment passed inside. As you can see here, we have just one fetching point, and then we just pass down the data we need and access the store directly in an efficient way.

Okay, we have it, but let's think about edge cases. Do you feel that having one fetching point for feeds is actually enough? I'll label this as an edge case. What if you're scrolling down and new stories come in? How can we load them? The problem is that if we just keep executing the get posts request, we have a traffic overhead, and we want to load new stories efficiently. How can we do that? For new stories we have several approaches: one is long polling, the second one is WebSockets, and the third one is server-sent events. Which one to choose? Let's describe them.

Long polling is a technique where you ask the server for the data at a set interval, a constant one, like 200 milliseconds. What disadvantages does it have? The first and main one is the traffic overhead: long polling sends a full request with all the tokens, headers and so on, and we don't want to send this whole overhead just for one story. Long polling also has a longer latency in production applications, especially on mobile phones, which use cellular networks. It's not a problem for this news feed, but it's still a disadvantage. The advantage is that it's a very simple technique, because it's just a request that we send at a specific interval.

The next thing is WebSockets. WebSockets provide bidirectional data transfer, and it's actually very
fast; it's actually real time. But do we need to load this in real time? I think we don't. And the major disadvantage here is that WebSockets do not support the HTTP/2 protocol, and I think this is the main thing. Besides load balancing and other issues, I think HTTP/2 is the most important for us, and we don't actually need real-time data loading.

The third technique is server-sent events. What are server-sent events? It's actually a very modern technique, I think: you subscribe to events on the server and get updates in a binary format. It works over the HTTP/2 protocol and it's very effective, so we load only the piece we need, in a binary format. And yes, it's a binary format, not JSON, but we can parse this binary format very easily and it's not a problem for us; we get this piece of data without any overhead. Server-sent events are also easy to load balance compared with long polling and WebSockets, of course. They don't have any of the problems of long polling and WebSockets, and they are fully supported by HTTP/2. They have a longer latency than WebSockets, but a latency of about 60 milliseconds is fine for us. So we are going to load the new stories with server-sent events, and let's add this to the endpoints we have; not a GET, let it be something like "subscribe".

Okay, we've fixed the edge cases for the API, and I think we can go further with our requirements. The next thing is the infinite scroll. So what is infinite scroll? Infinite scroll is when you have a set of stories and you want to scroll them down infinitely, and new data appears when you get to the edge of the last stories. But I would like to start with why this pattern is actually chosen for the Facebook news feed. We have entertainment content, and in the case of entertainment content we don't need to have the
pagination or a "load more" button. Pagination is used for data we actually want to analyze, or for table data, and the "show more" button is like a variant of pagination. We don't need it because it creates an additional action for the user, and we want the user to stay on our website and scroll more and more.

So let's describe how this feature is actually built on websites. Here is the schema. We have the stories, say 10 stories, and this is the browser window; imagine that this whole rectangle is the browser window. Based on the user's device we have a different viewport, and some viewports can contain 12 stories, 50 stories and so on. Let's say we have 10 stories visible in the viewport. When we scroll down we need to load more stories, and for that we need to understand that we are actually at the end of the page. How can we do that? In the browser API we have the Intersection Observer, which allows us to check whether the user's viewport intersects with some HTML nodes. For that we use two nodes, the top sentinel and the bottom sentinel. The top sentinel works when you scroll up and you want to show another piece of data which was shown previously, and the bottom sentinel works when you reach the bottom: we load more data and show it.

And what are the top padding and the bottom padding? Imagine the user has loaded 500 stories. That's 500 story components and maybe 2,000 DOM elements. Do you think this will work fast on any device? That's the problem we want to solve. We want to show a constant number of nodes in order to prevent performance issues; with a large number of nodes, the performance on mobile phones will suffer. So the idea is to have a sliding window. We have the data, say the store holds 100 elements, and we want to show only the window which the user
currently sees in the viewport. So, for example, with a viewport of 10 stories, when we scroll down we want to replace the stories with new ones and keep maintaining this number of DOM elements. Let's describe this with a picture to understand it better.

Here is the schema. The user starts with story 1, and, for example, story 10 is the borderline. Then the user scrolls down a little, to story 10, and this is the intersection zone, which indicates when we need to load more data. What are we going to do? We are going to change the window which the user sees, so story 5 becomes the first story on the screen and story 15 is the last one. We moved our window from stories 1 to 10 to stories 5 to 15 and show only this number of stories. The violet zone here is the future zone, which the user does not currently see; this is the future data. So we keep these 10 elements on the user's screen, just updating their data, and this is the main feature of the infinite scroll. By maintaining a constant number of nodes we prevent all the performance issues here.

What else can we say about it? I think we are pretty much done here. We have the top and bottom sentinels, and this is fine for describing the basic idea. So why do we need the top and bottom padding? When we scroll down, we need to create the effect that the user has loaded more data. So when we scroll down, we increase the top padding, and this creates the effect that all this data is still there. Then when we scroll up, we decrease the top padding but increase the bottom one, and this also creates the effect that we have some data there. We keep the size of the page as if this data were there, although it's actually not there.

I feel like we can go further, so let's label it too: we have the edge cases and we have the infinite scroll design. Okay, we've designed it, and the next thing is the optimization.
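As a rough illustration of the sliding window and the top and bottom padding described above, here is a minimal TypeScript sketch. The names `WindowState`, `computeWindow` and `shiftWindow` are hypothetical, and the sketch assumes every story has the same estimated height, which a real feed would not guarantee.

```typescript
// Hypothetical sketch of the sliding-window idea; not a real library API.
interface WindowState {
  start: number;         // index of the first rendered story (inclusive)
  end: number;           // index just past the last rendered story (exclusive)
  paddingTop: number;    // px standing in for stories removed above the window
  paddingBottom: number; // px standing in for stories not rendered below it
}

// Assumption: every story has the same estimated height in pixels.
function computeWindow(
  firstVisible: number,
  total: number,
  windowSize: number,
  storyHeight: number,
): WindowState {
  // Clamp the window so it never runs past either end of the loaded data.
  const maxStart = Math.max(0, total - windowSize);
  const start = Math.min(Math.max(0, firstVisible), maxStart);
  const end = Math.min(total, start + windowSize);
  return {
    start,
    end,
    paddingTop: start * storyHeight,
    paddingBottom: (total - end) * storyHeight,
  };
}

// Called when a sentinel is reached: the bottom sentinel moves the window
// down, the top sentinel moves it up. The number of rendered nodes stays the same.
function shiftWindow(
  current: WindowState,
  direction: "down" | "up",
  step: number,
  total: number,
  storyHeight: number,
): WindowState {
  const size = current.end - current.start;
  const delta = direction === "down" ? step : -step;
  return computeWindow(current.start + delta, total, size, storyHeight);
}
```

In a browser, the callbacks of an Intersection Observer watching the two sentinels would call something like `shiftWindow` and then apply `paddingTop` and `paddingBottom` to the list container, so the page keeps its full height while only the window's stories exist as DOM nodes.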
Okay, let's think about this: optimization. The optimization and performance of a website splits into several things: network performance, rendering performance and JavaScript performance. Let's start with the network performance.

The first thing we usually do is optimize our assets in order to load them faster, and this one is quite obvious: gzip the resources we have. We can go even further and use a more modern format for the browsers which support it, and that's Brotli. Okay, so we have gzip and we have Brotli.

The next thing to do for performance is to look at images. We need to serve our images in an appropriate format, and this is WebP. If the browser supports WebP, we can serve the image as WebP, and if the browser does not support it, we fall back to PNG. But this is not the end: we can also optimize the image based on the viewport. How can we do that? There are several options. We can create images in several sizes, but on the modern web we create a service for that. So we have an image service, we send the viewport to this service, and optimized images come back to us. By the way, we can also cache these images on a caching CDN and serve them from the CDN, without actually generating any images, because viewports are usually very similar across devices, so the cached images are already on the CDN. We also don't want to serve our content from Australia to Europe and so on, and we get nice geolocation with such services.

Okay, so what else can we add here for the network performance? We can also improve it by switching to HTTP/2.
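Going back to the image service idea above, here is a small TypeScript sketch of how a client or edge layer might build a request to such a service. Everything specific in it is an assumption made for illustration: the host `img.example.com`, the query parameters `w` and `fmt`, and the list of width buckets. The bucketing reflects the point that similar viewports should share the same cached image on the CDN.

```typescript
// Hypothetical image-service sketch: host, path and parameter names are assumptions.
type ImageFormat = "webp" | "png";

// Browsers that can decode WebP advertise "image/webp" in the Accept request header.
function pickFormat(acceptHeader: string): ImageFormat {
  return acceptHeader.toLowerCase().indexOf("image/webp") !== -1 ? "webp" : "png";
}

// Round the viewport width up to a small set of buckets, so that devices with
// similar viewports request the same URL and hit the same CDN cache entry.
const MAX_WIDTH = 1920;
const WIDTH_BUCKETS: readonly number[] = [320, 480, 768, 1024, 1440, MAX_WIDTH];

function bucketWidth(viewportWidth: number): number {
  for (const bucket of WIDTH_BUCKETS) {
    if (viewportWidth <= bucket) return bucket;
  }
  return MAX_WIDTH; // wider than every bucket: serve the largest size
}

function buildImageUrl(
  imageId: string,
  viewportWidth: number,
  acceptHeader: string,
): string {
  const width = bucketWidth(viewportWidth);
  const format = pickFormat(acceptHeader);
  return `https://img.example.com/${encodeURIComponent(imageId)}?w=${width}&fmt=${format}`;
}
```

For example, viewports 375 and 390 pixels wide both map to the 480 bucket, so the second device can be served straight from the CDN cache instead of triggering a new resize.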
What is HTTP/2 and why is it important? HTTP/2 is a more modern protocol which enables many cool features, like multiplexing. Multiplexing is the thing which solves a problem of HTTP/1: HTTP/1 had five connections at most, and that's actually why webpack was built; we needed to put everything in one bundle. We are not required to do that in HTTP/2. HTTP/2 has multiplexing, and with multiplexing we can load hundreds of resources in parallel, and this improves the performance of the website significantly.

What else can we do here to finish with the network performance? We have HTTP/2, and this enables us to do bundle splitting. We do not load everything at once; we can split our application into bundles. For the news feed, we can have the feed, then we can also have, for example, the header, we can have some vendor libraries, and then we can have some analytics scripts. With that we can simplify the loading process, because we can load all these resources at once, and this is a cool feature.

So let's go to the rendering performance. For rendering performance, one important metric we need to look at is the time to first content. We can decrease this time by providing server-side rendering for some pages, because then, for example, we can pre-render some feeds and then show them. What else? We have the CSS, we have the images and we have the scripts. When the browser sees the CSS, we do not want to block the whole rendering, because when the browser sees a subresource it loads it first, and only then renders the page. So for rendering performance we can inline the critical styles, and we can also inline critical scripts. What does inline mean? We serve them inside the HTML, so we do not load any additional data for them.
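To make the server-side rendering and inlined critical styles from above a bit more concrete, here is a minimal TypeScript sketch of a function that produces the initial HTML. The names `ShellInput`, `escapeHtml` and `renderShell` are hypothetical, and the sketch assumes the critical CSS is trusted build output while story text is untrusted user content.

```typescript
// Hypothetical sketch of a server-rendered HTML shell; names are illustrative only.
interface ShellInput {
  criticalCss: string;          // small set of styles needed for the first screen
  prerenderedStories: string[]; // story texts rendered on the server
}

// Escape user-provided text before placing it into HTML,
// so story content cannot inject markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function renderShell(input: ShellInput): string {
  const stories = input.prerenderedStories
    .map((text) => `<article class="story">${escapeHtml(text)}</article>`)
    .join("");
  return [
    "<!doctype html>",
    "<html><head>",
    // Inlined critical styles: no extra request is needed before the first paint.
    // Assumption: criticalCss comes from the build step, not from users.
    `<style>${input.criticalCss}</style>`,
    "</head><body>",
    // Pre-rendered stories are visible before any JavaScript bundle has loaded.
    `<main id="feed">${stories}</main>`,
    "</body></html>",
  ].join("\n");
}
```

The browser can paint the first stories from this single response; the remaining styles and the split bundles can then be loaded afterwards.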
Also, we can serve some scripts with the correct loading strategy. For example, we can load scripts asynchronously, so they do not block the initial render, or we can use the defer keyword so as not to block the whole rendering, wait until the page is loaded, and only then load those scripts.

So, JavaScript performance. JavaScript performance is actually pretty simple: in order to improve our JavaScript we need to do less stuff. Yeah, it's simple: just do less stuff, and do the stuff asynchronously. If we have some heavyweight work, we can also cache the results. If calculations on pretty large data are required, we can block the whole rendering, the whole interaction with the website, and we don't want to do that. But JavaScript is a single-threaded language, so how can we overcome this? We can go with service workers and also web workers: cache everything and do the heavyweight job in a web worker.

Okay, so what's left here? Let's also add to the rendering performance a CSS modules strategy, or class naming strategy. What is a class naming strategy? It's something like BEM and CSS Modules, a kind of theory of how we can name the classes effectively. If we have multi-level CSS selectors, then we have a problem with browser performance, because the browser needs to parse the selector and select the elements. So keeping selectors simple is quite good for performance.

Okay, I think we did a lot of work on the optimization, but what else? When the user sees the page and sees the stories with images, we can show a placeholder. Some scientific research says that if loading takes some time and we show a loader, then the perception of time is different for the user, and the user thinks that the
image loads faster, so we can show it. Also, we can simply not render any image until the user's viewport actually intersects with it; this is called lazy loading of images. Okay, this is enough for rendering here.

One more thing is that we want to have a PWA mode. For example, if you are getting on a plane, you want to preload your stories and then read them on the plane, so we can enable an offline mode. How can we do that? It's very simple: we can use service workers with a cache. They can cache all the resources, and we can make all the data accessible offline like that. So I will label this as service workers.

Okay, we've covered that. The last point is accessibility. Accessibility is a very broad topic, with support for different screen readers and so on, but we can enumerate the basic things. The first thing is that we want to support different color schemes in our application, to enable users with different kinds of color blindness to use our website. What else? We want our application to be accessible to screen readers, so all inputs, text areas and other elements should have an aria-live attribute. With aria-live, when you change some content, the user's screen reader voices it and dictates it to the user, so the user can follow the changes inside the input. Images also need to have alt attributes.

And we need to have hotkeys. So which hotkeys can be added for the news feed? It can be a hotkey for new stories, for posting a story, for scrolling down and scrolling to the top; the fifth one is a call for help listing all the hotkeys, and the sixth one is return to main menu, which gives quick access to it, and also the sharing options.

Okay, we've covered the accessibility part, and I feel like we
did a great job here. So with this we've discussed each point: the components architecture, data models, API, data store, edge cases, infinite scroll, optimization and accessibility, and I feel like we are done here. Thanks for watching this video; I will provide this schema in the attachments to this video. Have a good day and have good JavaScript. Bye, and see you in the next videos.

Oh, by the way, please leave comments, and we can make the content better together if you give me your opinion on how we can actually improve the whole