I'd like you to design either Tinder or Bumble, take your pick. [Music]

Hi everyone, and welcome to another mock interview with Exponent. Today I'm here with Yen Lee. Yen, you want to introduce yourself?

Yeah, hi everyone, I'm Yen. I'm currently working as a senior software engineer at TikTok on the mobile reliability team, and previously I was at Meta for four years working on Android Messenger and also AR/VR for the Ray-Ban Stories that Facebook released. I'm happy to be here doing a mock interview.

Awesome, we're happy to have you. So my question today for you, Yen, is: I'd like you to design either Tinder or Bumble, take your pick.

Before I jump straight into the design, let me go over the functionalities and requirements. I would like to list out the functionalities that we're going to touch on today. When I think of an app like Tinder, I am thinking about a profile that will be created when the user first joins, a feed/recommendation, and then matching: the swipe right, and if two users swipe right on each other, we'll be matching those two users on the back end. And then lastly, private messaging: once two users are matched, we'll be creating a thread for texting and exchanging messages, possibly uploading images or videos in their private inbox. Going into a little more detail on the profile creation: probably a bio, preferences, images for their profile, and then potentially videos; maybe that's an extra. So these are the core functionalities that I'm thinking of. Let me know if I'm missing anything.

I think this is a good start, yeah.

Okay. And then here I can maybe add a little extra if we have time, so something like Super Like. I've personally not been an active user of Tinder, but I kind of know what Super Like is, and
then system monitoring, like logging, just the health state of the overall system, and then maybe a subscription model, which I assume is how they actually monetize their platform.

Yeah. And I'm curious why system monitoring and logging is under extra. What affected that decision to put it there?

So it's more for the sake of the interview, just because I think a logging and monitoring system is pretty much its own design. I think it's a core component, but I don't know if we want to dive super deep into this part of it.

Okay, sounds good.

So next I think we can discuss a little bit about the traffic. From a social media app perspective, I'm thinking maybe they have 50 million users and, I'm just guessing here, a million active users per day; not everyone is actively using Tinder, but there can be 50 million users registered. And then let's say 500K new profiles created per day (I might be underestimating the numbers) and a billion matches per day. The numbers are roughly estimated and could be a little on the lower end, but essentially this should give us a rough estimate of the storage that we need for all the images for the profiles. So let's say there are 50 million users, 1 million active users, and 500K new profiles created every day; then I think we can do some calculation on the photo storage. In terms of profile photo storage, let's say 200 KB per photo, on average six photos per user, roughly 50 million users, plus the 500K new profiles created; I think roughly we'll be looking at 300 to 600 TB. And then for messaging there's probably way more, so I think for messaging we can say there are 200 million messages exchanged and
10 million photos uploaded every day, and then we can do more of a deep dive into the storage part when we get into that section of the design. Overall, I think this is the background functionality and requirements that I'm going to base my design on. Do you have any other questions before I dive right into the design portion?

No, I think we can dive in.

So let's just go down the list of functionality here and start from profile creation. Here I think we're creating a profile, and we need to upload the images and also store all the preferences and the bio information. We'll start from the client side, so the client is the app. The flow that I'm thinking of is: when users are setting up their profile, they'll be answering a series of questions, so those are in text form, I'm thinking a JSON format, but then they'll also be uploading images. So what I'm thinking is they'll probably first need to upload images before they can actually successfully store everything together. Here the general flow that I'm thinking of is they'll need to upload their profile images to an upload service, and then it touches the application service, which will be transcoding or resizing the images to make sure that we don't store the original size and can save a little bit of space, and then doing some other processing on the image, whether we want to filter or do some encoding or something, and then we can store it into a back-end blob storage. This is the image pipeline, and I would assume that it would return some sort of image ID for each of the images, which we can use for the overall profile JSON metadata. So I think here,
before I dive into the design of the whole profile, I'm just going to outline what the profile data looks like. For profile data, I'm thinking there's going to be a profile ID, which is a string, and then a user ID, also a string. Just noting that I have both a profile ID and a user ID because if the user decides to delete their profile once they find a partner, and then unfortunately they need to break up with them and recreate a profile, there will be a different profile ID for the same user, correct. And then for preference, there's another layer of JSON: for example gender, a string, and then whether they have an age-bound preference, and then other kinds of profile preferences that they have; I won't list all of them. Then some bio information about themselves: their own gender, age, location information, and maybe other types of bio information. And then photo IDs, which will just be a list of IDs. The list of IDs comes from the back-end server once the uploads have been successful, so only when they are able to upload their profile images will we actually finish filling out this profile data. Once they do that, I think we can go ahead and fill in the second part of this pipeline, which is creating that profile and inserting it into the profile database. So here there's the user profile DB, for which I'm thinking of using a NoSQL database to capture this unstructured JSON format. Okay, so here they would go through some sort of HTTP POST request to the application server, which will do any processing it needs to, along these general lines, which we can dive a little deeper into, but it will end up storing into the profile DB. And like I mentioned earlier, this is also where we would be
talking to the encoding service for the uploading service. Okay, so this is pretty much the general design flow for the profile creation, and I think I will go on to the next part, which is feed and recommendation. After we've gone through all the core components, we can go back and talk about scalability, adding load balancers or additional in-memory caches, stuff like that. So the next part is feed recommendation. For this part I want to talk about the small mini requirements of this area. First, the general requirements of our feed recommendation: we would like to recommend local users, so we don't want to recommend people who are across the globe, and we also want to show active users. Like I mentioned earlier, there could be 50 million registered users, and some of them are within your area but are not actively using the app, so we do want to make sure that we're showing active users when it comes to feed recommendations.

Yeah.

And we also want to make sure we balance between showing too many matches versus not enough matches for each user. What that means is if one user is getting too many matches, we want to make sure that they are not getting recommended even more, whereas some people are not getting any matches, so for them we want to focus on reducing some of the filters or loosening up those requirements when it comes to the recommendation algorithm.

Okay. Is that a product-inspired direction that you're going in?

Yeah.

Okay.

And this could be an extra thing, but I'm just listing it now. And then I think we want to prioritize either pretty fast
recommendations, so there are always new recommendations showing up, so low latency, versus real-time recommendation. So this is where we decide what to prioritize: do we want to make sure that there are more recommendations coming in once a user has swiped through all of their profiles, or do we want to make sure that more of the newly created profiles are showing up? I think this is something that we can discuss a little more. I'm leaning more towards low latency, just because I feel like it doesn't really matter if we're late to some of the newly created profiles if we will eventually show them, but that is something that we can decide. So these are some of the things I'm going to keep in the back of my mind when I'm designing this part.

Okay.

So first I would like to talk a little bit about how we can recommend the local users. One thing we can do is database sharding based on geographical location. Since every user has provided their location, so latitude and longitude, we can use that information to do geo-sharding for the database storing their bio information when they are creating their profiles. Using this approach, we can then perform a pretty fast lookup and also make sure that we only retrieve information related to users who are around you. Let's go ahead and map out this design. Again we have the client, the app, and then we have a recommendation service, to which the client will be sending a request for a recommendation list. For this part, to make things a little easier to understand, we can go with a geo-sharding mapper. This allows us to quickly grab or store profiles in the correct shard, and essentially it's using
the user's latitude and longitude to store a mapping between that and the sharding index, and to correctly figure out which shard we should be talking to. Essentially the sharding mapper is more like a wrapper interface to talk to an actual library. For example, we can use Google S2, which I think is one of the popular geo-sharding libraries that we can be using. So here we can use that as the actual implementation for achieving the geo-sharding, and the geo-sharding mapper will just talk directly to Google S2 and store profiles into the storage or get all the shards needed, and I think it will also just talk to the shards. So the design is that the client on the app will send a GET request for the recommendation list, providing latitude and longitude, and the recommendation service will send the location and other information needed over to the geo-sharding mapper; so here we have sending location / other info. Then we get all the shards needed for the current location. For example, if we are looking at a user based in New York, the city is pretty dense, so I would assume that there might be multiple cells or servers storing relevant information that will be needed for the recommendation service. We can access that by just talking to the S2 interface, and it will access the right shards for us and then come back with the stored information. So, making that into a more explicit API, we can say get recommended feed is our endpoint, and then we provide the user ID and a profile ID. Actually, if we think about it, since we send a GET request, instead of providing the lists we can simply provide the user ID and the profile ID of
the user that we are currently requesting the recommendation list for. By providing that, it will be sent to the server for the recommendation service, which will use the user ID and profile ID to retrieve the relevant information stored on the server side. Their profile information has all the profile details, such as the location, so the service will retrieve that and then send the location and other info to the geo-sharding mapper, which will get all the shards needed and then come back with a list of the other users' information to the recommendation service, which will then do filtering on that. So let's talk a little bit more about the recommendation service. This is the GET request it will receive, and I think the response that we are hoping for is simply a list of profiles that we want to display, and a profile is simply what we defined earlier. So if we can grab a list of profiles that we need to show in this user's feed, then on the app we can simply display that in the UI. For the recommendation service, let's talk about what we need to be doing. We need to develop a strong algorithm to filter the giant list of profiles or bio infos returned from the geo-sharding mapper. Let's list a couple of requirements for what the recommendation service is actually doing. The first one is that we need to make sure all the profiles being displayed on the feed fulfill the current user's preferences. Remember, when we were defining the profile, every user has preferences that they can set, so part of the recommendation service is to make sure we are filtering out the profiles that are not meeting these user preferences, and then we
should also only include active users. What this means is that any users who have not used the app within, for example, the last five days will be filtered out; so filter out any user who last logged into the app more than X days ago. This can be part of the profile information that we keep in the back end, where we are storing login information: when the user logs in, we'll do a push to the server to update their profile and add a last-login field with the server time. This will allow us to recognize who the active users are. And then we should ensure that the list has a hard limit. For example, we don't want to be returning 2,000 profiles to the user, because they will not have the time to scroll through all of them, and by the time they go through all of them it might be stale. So a hard limit on the return count; let's say 50 is the count, and it could be dynamic, set by a setting. So that's very basic. Now let's go back to what we have here. We are recommending local users: by using the geo-sharding algorithm we should be able to achieve that. Show active users: this is part of the recommendation service. Balancing between matches: this is something extra that we have not touched on, but it can also be included in the recommendation service, where users are also filtered to make sure that they've only been matched a certain number of times; otherwise we can have a smaller limit on the return count. Low latency versus real-time recommendation: by using the geo-sharding algorithm I think we will achieve pretty low latency, just because we will always be accessing only the shards that are relevant to us.

Yeah, I'd like to hear a little bit more about this low latency decision. I remember you touched on it a little bit
earlier. What factored into that decision that you made?

Yeah, so I guess this is kind of a personal decision, but just breaking down the two different approaches, low latency versus real-time recommendation: low latency matters when the app is running out of recommended feed, so when the feed is empty, when you no longer have any profiles to swipe on and the feed needs to retrieve more recommendations. Real-time recommendation requires a different set of algorithms and storage mechanisms, where...

Could you define the two first? Like what it means: what is the product behavior on either of these options?

Sure. Low latency, I think, means that when you are running out of recommended feed, for example if the user swipes left really fast and goes through their entire feed in a very short amount of time, and the app needs to request a recommended list from the server, the waiting time here is short. If it's under a few seconds, let's say 10 seconds, or I would say within a couple of seconds, it's all relatively low latency when we're talking about such large scale and doing back-end processing on filtering and everything. So within seconds the user can go back to swiping. Versus real-time recommendation, where I'm thinking the back end needs to do more processing and could take up to minutes to curate a more real-time recommendation of users who are both relevant to you and also have newly created profiles.

And why do we care about the new profiles created? Is that because they've just run out of people in the geographic area that they've specified?

Yeah, I think when it comes to real-time recommendation, new profiles matter to Tinder or any kind of
dating app because it ties into the active user part, where we need users who are actively using the app and who are fresh to it, as opposed to people whose profiles could be three days old or who are not actively using the app, to create more matches.

Right, so the point is creating more matches. Got it. So we're saying we're optimizing for, even if it's people who are not super active, even if it's not the best match for you, we'd rather just give you people ASAP to see beyond the initial list that we gave you than make you wait for the best possible matches.

Yeah.

Got it, okay.

And I think the decision is kind of subjective. In general, from the social media aspect, I know that, at least from my experience, users are pretty impatient, so the design that I've always opted for would be low latency, sacrificing some of the quality of the algorithm's results here.

Okay, yeah, that's fair.

So that's where I was going with that. Going back to the recommendation service, these are the basic algorithm and filtering system we have, and then we can add a bit more on the balancing between matches and the low latency versus real-time recommendations when it comes to using the geo-sharding.

I think this is a good place to stop. [inaudible], Yen?

No, I think this is a huge space. Like I mentioned, I think monitoring and logging is a whole other area.

Yeah. For people who get broad questions like these in their interviews, what are your recommendations to them? How do you recommend they approach it?

Yeah, I think I would always go from big to small: breaking down from the general requirements and functionalities, focusing on a couple of components, and then from there applying the same thing, just breaking down further. Like here, I think even in this space I was breaking it down into different
components and different requirements within each of these sub-components, and then focusing on different things. And I think as an interviewer I would also do the same thing: if the interviewee is focusing on something and going too deep into one thing, pull them back and ask about something else, or hint towards, okay, what about this other core functionality that you mentioned?

Yeah, okay, that's great. Thank you so much, Yen. It was wonderful having you, and I hope people are able to use this to build their own dating app. So we'll see. [Music]
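
The filtering rules Yen lists for the recommendation service (match the viewer's preferences, drop users who last logged in more than X days ago, and cap the returned list) can be sketched roughly as follows. This is only an illustrative sketch, not the design from the interview: the `Profile` and `Preference` shapes, the field names, and the `recommend` function are all hypothetical, and the candidate list is assumed to have already come back from the geo-sharding mapper.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Preference:
    # Hypothetical preference fields; the interview only names gender and an age bound.
    gender: str
    min_age: int
    max_age: int


@dataclass
class Profile:
    # Hypothetical, trimmed-down version of the profile data outlined in the interview.
    profile_id: str
    user_id: str
    gender: str
    age: int
    last_login: datetime  # server time recorded at the user's most recent login
    preference: Preference


def matches_preference(viewer: Profile, candidate: Profile) -> bool:
    """Return True if the candidate satisfies the viewer's stated preferences."""
    pref = viewer.preference
    return candidate.gender == pref.gender and pref.min_age <= candidate.age <= pref.max_age


def recommend(
    viewer: Profile,
    candidates: List[Profile],
    now: datetime,
    max_inactive_days: int = 5,  # the "X days" from the interview, five as the example
    limit: int = 50,             # hard limit on the return count
) -> List[Profile]:
    """Filter nearby candidates down to an active, preference-matching, capped feed."""
    cutoff = now - timedelta(days=max_inactive_days)
    feed: List[Profile] = []
    for candidate in candidates:
        if candidate.user_id == viewer.user_id:
            continue  # never recommend the viewer to themselves
        if candidate.last_login < cutoff:
            continue  # inactive: last login is older than the cutoff
        if not matches_preference(viewer, candidate):
            continue  # does not meet the viewer's preferences
        feed.append(candidate)
        if len(feed) >= limit:
            break  # stop once the hard limit is reached
    return feed
```

Calling `recommend(viewer, candidates, now=datetime.now())` would return at most 50 profiles. The "balancing between matches" idea could be layered on top by passing a smaller `limit` for users who already have many matches, but that policy is left out here because the interview does not specify it.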