Transcript for:
Depth-First Search (DFS) Graph Algorithm

in this video we will take a look at our first graph algorithm that first search or more informally DFS this algorithm is one of the most fundamental algorithms in graph theory and many of the ideas in this video will carry over to more advanced graph algorithms for that reason in this video I'm going to spend some extra time building intuition for the ideas behind this algorithm here's what you can expect to get out of this video we will first introduce the concept of a graph traversal and what that even means then we will try to build our own graph traversal method and show how it intuitively leads to core ideas of depth-first search once you have an idea behind what depth-first search is trying to achieve we will then spend some time going through the implementation of DFS in a visualized manner and then we'll finish off this video with some really cool applications of depth-first search that I'm sure you're not gonna want to miss so make sure you stick to the end of the video let's get started given a particular graph one useful ability involves starting at a single vertex and finding a way to get to every other vertex of the graph the formal name for this ability is a graph traversal which is simply defined as an algorithm that visits every vertex of the graph there are several graph traversal algorithms and they are primarily differentiated by the order in which they visit vertices in this video we will focus on the DFS ordering so now that we know the basic idea behind a graph traversal let's take a step back and see if we can sort of invent our own graph traversal the line of thought that I'm gonna present here is actually exactly the DFS algorithm the goal here is to show you how the DFS algorithm is really a surprisingly intuitive idea something that with some thought you probably could have come up with on your own let's take a crack at it with this graph given a start vertex we want to find in order that we can go through the vertices of the graph so that we guarantee that we end up visiting every vertex let's start with vertex 0 from vertex 0 we have a couple of options for how we might want to continue the option we choose doesn't really matter too much here since our hope is that we'll eventually explore all the vertices anyway let's say that we choose to visit vertex 1 from here what now well now that we're at vertex 1 we have a new set of choices for where to go next it doesn't really make sense for us to go back to vertex 0 since we've already seen it so let's pick a new vertex to explore let's go with vertex 2 and from vertex 2 we only have a couple of options namely going to vertex 0 or vertex 1 but we've already seen those vertices right so what we might naturally do is say ok there's nothing further we can do from this vertex so let's take a step back to where we came from which is vertex 1 and from vertex 1 we now have two unexplored vertices that we can then go to let's go to vertex 3 from vertex 3 a path 4 is pretty set in stone the only unexplored node is vertex 5 so let's go there from vertex 5 we have a few options on how to continue traversing the graph and let's go ahead and pick vertex 6 at vertex 6 we've now reached another dead end similar to what happened with vertex - so let's retrace our steps back to vertex 5 from vertex 5 let's say we pick vertex 7 as the next day we visit from vertex 7 the only path Ford is the vertex 8 and from there we again have only one choice leading us to vertex 9 at vertex 9 we've now hit another dead end so let's go back to vertex 8 now what's different this time is that from vertex 8 we have already explored everything there is to see so let's retrace our steps once again we got to vertex 8 from vertex 7 so let's go back to there from vertex 7 via the same story so we go back to vertex 5 and everything from vertex 5 has been fully explored so from vertex 5 we now go back to vertex 3 which is the vertex that godess the vertex 5 same story here so let's take one more step back to vertex 1 and now that we are back at vertex 1 we have finally found a new undiscovered vertex so we now visit vertex 4 from vertex 4 there's nothing new to explore so we go back to vertex 1 which now also has nothing more to explore which finally leads us back to vertex 0 which now is also completely explored and now that we're back to where we started that's how we know we have traversed every vertex of this graph so I know that was a lot but the philosophy of how we explored this graph but continuing to explore new vertices as we see them until we hit a dead end is what depth-first search is all about when performing death first search think of yourself as a bold and adventurous Explorer where every time you see a new vertex you will continue to explore new vertices until you hit a dead end in which case your retrace your steps and continue exploration it's worth noting that there are actually several valid depth-first search orderings for this graph let's go through one more valid ordering as an example and also as further familiarization and practice with the core idea behind DFS we can generate another ordering by just changing up the choices of vertices to explore whenever we have a choice while sticking to the core philosophy let's start again at vertex 0 but now when we have a choice between choosing vertex 1 or vertex 2 as the next node to explore let's pick vertex 2 last time we went through this if you remember we picked vertex 1 again I want to emphasize that both of these choices are valid and if you were ever as for the DFS order on a test or an exam the instructions must be clear on how to make these choices a common way to do this is to enforce the constraint that you must always pick the node with the smaller value when you have a choice in that case we would have picked a vertex one anyways back to what we were doing after visiting vertex two we only have one choice which is now to go to vertex one from there let's say we now visit vertex 4 which then leads us to a dead end so we go back to vertex 1 now the path forward leads us to vertex 3 which then forces us to visit vertex 5 where we now have several choices let's pick vertex eight this time from there we'll choose to explore vertex 9 which leads us to a dead end so we go back to vertex eight the next vertex we then explore is vertex 7 which leads us to another dead end so we go back to vertex eight there's nothing more to explore from vertex eight so we go back to vertex five from vertex five we then explore vertex six which is the final vertex of the graph that we haven't seen so that completes another possible DFS ordering let's now talk about implementation it helps to try figure out the implementation on a smaller graph since it has fewer moving parts here's one valid DFS ordering for this graph one thing that you may have noticed about this process through the examples that we've done so far is that it's actually naturally recursive we started a plucked up core node and once we visit that node it's almost like we can ignore that node in the graph and run DFS on all of the neighbors of that node so this inspires one initial idea for how we might implement DFS given a particular graph G and a starting vertex V we first visit V then for each neighbor of node V we can run DFS on the graph with that neighbor as a new starting node remember that we are representing graphs using adjacency lists so an operation like finding the neighbors of a node is quite simple so let's try the first attempt at an implementation see what happens we start at vertex zero and let's say that the first neighbor we end up picking is node one so that's the next node we visit we then go through the neighbors of node one and repeat the process but there's one big issue with this implementation take a second to see if you can figure out what the problem is the issue here is that when we iterate through node one's neighbors one of those neighbors is node zero this means that with this current implementation we'll end up calling DFS on node zero once again which would create an infinite cycle of recursion so how can we fix this well what we want to do is ideally only call the DFS function on nodes that we've never seen before one way to do this is to essentially mark nodes that have already been visited we can do this by maintaining a list of boolean values where each one is mapped to a vertex of the graph initially all these values will be false since we haven't visited any vertex but as soon as we visit a vertex we can mark the vertex as something we've seen before by changing the respective index to true now with the vertices marked appropriately we can now call DFS recursively on a vertex if that vertex is not currently marked let's dig a little bit deeper to verify that this implementation actually works we first called the DFS function on our graph with starting vertex zero this will then visit vertex zero giving us the first vertex of our traversal after that we marked the vertex and begin looping through the neighbors of vertex zero let's say the first vertex from our neighbors we look at is vertex one what happens now is that we make a recursive call to our DFS function with the new starting vertex of 1 the first thing that happens in this recursive call is we visit vertex 1 which becomes the second vertex in our DFS traversal now the key idea is that when we look at the neighbors of vertex one vertex zero is already marked so we can't make another call to vertex zero so the only remaining option is to recursively call our function on vertex three the process is exactly the same we visit vertex three which ends up being the third vertex in our DFS traversal now from here we can go to vertex two or we can go to vertex four and the choice we make here is arbitrary let's say the algorithm picks vertex four first this means that we have another recursive call with their starting vertex now assigned to vertex four we then visit vertex four which is the fourth vertex in our traversal now at this point in the recursion we now look at all of vertex force neighbors which includes only the vertex three but notice that vertex three is already marked so what that means is at this point there's nothing further to do in this call so we go back to the previous call of the DFS function on the starting vertex three we had just finished our call to the neighbor vertex four so let's now evaluate the next unmarked neighbor the only remaining unmarked neighbor is vertex two so we can make a recursive call to the function with starting vertex two in this recursive call we visit the vertex and that becomes the fifth vertex in our traversal there are no more unmarked neighbors for vertex two so we can go back to the call on starting vertex three vertex three also has no more unmarked neighbors so we can go back to the call on vertex one same story here so we go back to our first initial call on vertex zero which also has no more unmark neighbors which means at this point the DFS traversal is complete pretty elegant and simple algorithm overall and I hope drawing out the recursion gives you a better understanding for why this algorithm works there's one more implementation of DFS that I think is equally important to understand this is an iterative implementation where we have the same concept of marked nodes as before but the key difference now is that we will use a stack data structure to keep track of nodes to visit so at the start we will initialize a stack with just starting vertex and then while the stack is not empty we'll pop off the last vertex of the stack if that vertex has not been visited before we will visit the vertex and mark it accordingly afterwards we'll iterate through the neighbors of this vertex as we did before and insert them into the stack as long as they haven't been visited before the best way to understand this implementation of DFS is by seeing it in action when we call this function on this graph with starting vertex 0 we start off with a stack that contains just vertex 0 we then remove this vertex from the stack and this is consequently the first vertex that we visit and Mark in the graph as a result it's the first vertex of our DFS order now we iterate through all vertex viewers neighbors and since none of them have ever been visited before we add them all to the stack we then go back into a loop and pop off the last inserted vertex which happens to be vertex 1 this vertex has never been visited before so we now visit this vertex and mark it this is the second vertex in our order now as before we go to the neighbors of vertex 1 vertex 0 has already been visited so we don't insert it into our stack but since vertex 3 has never been seen before we'll put it in our stack notice that vertex 3 shows up multiple times their stack but that's ok the marking of vertices ensures that they won't be visited multiple times the next vertex we pop off is vertex 3 which we then visit mark and add to the order as the third vertex vertex 3's for neighbors but only neighbors two and four have not been visited before so these are the next two nodes we add to a stack after adding these notices stack the next vertex we pop off is vertex four which means it's now the fourth vertex in our DFS order vertex four is only one neighbor in vertex three but since that vertex has already been visited we do not need to add it to the stack the stack is still non empty so the next vertex we take out is vertex to this vertex hasn't been visited yet so we visit mark and add the vertex to our order we've now visited all the vertices of the graph but there are still two vertices in the stack the algorithm will not add any more vertices to stack and will eventually pop off the remaining vertices from the stack but not visit them since they've already been marked the algorithm will then terminate when the stack is eventually empty while this algorithm is a little lengthier and more involved than the recursive implementation using the stack in this manner is a really helpful way for you to organize information if you ever need to identify the DFS order of a graph by hand on a written test or exam so you've seen two implementations of DFS thus far and I want to make an important note that for all practical applications both these implementations are equivalent in the sense that they both run in the same running time of our V Plus E time where V is the number of vertices in your graph and E is the number of edges as a result from a performance perspective if you run either of these algorithms on a huge graph you'll see no noticeable difference however the reason I want to show you both versions is that the recursive implementation is the more standard implementation of DFS you'll encounter since it's cleaner and easier to read but at the same time including the iterative approach is important since it's more generalizable to other graph traversal algorithms that require an iterative approach you end up seeing this logical structure a lot in other graph albums so I think it's a good idea to introduce it here on the topic of comparisons let's make a quick note about the differences between a pre order and post order DFS traversal the orders we have looked at thus far are all pre order DFS traversals all this means is that we output a vertex as part of the order as soon as we encounter a new vertex a post order traversal is slightly different in that we're gonna output the vertex as part of the post order only after there is nowhere else to explore from that vertex for example in this graph the first vertex in our DFS post order is 2 since that was the first vertex where we hit a dead end and had to go back the next vertex where this happens is vertex 6 which then ends up being the second vertex in a post order traversal and eventually we reach another dead end at vertex 9 which then becomes the third vertex in our post order now vertex 8 is also a dead end so we can add that to that post order the previous vertex 7 is now also in dead end and this continues until we get to starting vertex here's what the rest of the postorder traversal looks like and as with pre-orders reversals there are several other valid post order traversal for this graph the important idea here is to make sure you understand that a DFS post order will only add a vertex to the order when there are no more neighbors to explore at that vertex implementing a post ordered DFS traversal is actually quite easy once we have a pre-order implementation this is what the recursive pre-order implementation of DFS looks like that we just went through to make it a post order implementation all we have to do is instead of visiting a vertex as soon as we encounter a new vertex we will now visit a vertex only after all its neighbors have been visited this idea is reflected by just moving the call to the visit function to the very end of the DFS post order function and that's all there is the DFS post order in the last part of this video I want to talk about some of the applications of DFS DFS with some additional modifications can solve a ton of interesting problems in graph theory one example is cycle detection this problem can be solved by applying DFS with the additional check that anytime you encounter a new edge that points to an already visited vertex you have a cycle another example is finding connected components of a graph which just requires running DFS multiple times the main idea behind finding connected components is to start by running DFS on a random vertex DFS will then find one component the next step is to then run DFS on an unlisted vertex which will output another connected component continuing this until every vertex is visited will give you all connected components of a graph another more involved application of DFS involves finding a topological sort of a graph this feature applies to specifically directed acyclic graphs imagine each vertex in this graph represents a task if an edge points from vertex zero to vertex one as it does here what that means is that task zero must be done before task 1 a topological sort basically answers the question of how we can find a valid order to execute tasks in a directed acyclic graph this particular graph has several topological sort one of which is to start with task 0 which allows us to execute task 1 and task 4 from here we can move on to task three and then task 5 from task 5 it's important to note that we can actually execute task 8 until we finish task 7 so we execute task 7 which now unlocks task 8 and task 9 after this we can go back and complete task 6 and task 2 again this is just one possible ordering there are several other valid teleological sorts this problem can actually easily be solved with depth-first search specifically a DFS post-order traversal with a few modifications a DFS post order on this graph will give you the falling as one possible output do you notice anything interesting the key idea is that reversing the post order traversal of a graph will always give you a valid topological sort of the graph and I encourage you to think about why this trick works this is a fairly intuitive justification for why this works that I'll leave to us some food for thought the last application I want to talk about is kind of just a fun one that I learned about while doing some research and exploration for this video turns out that you can easily generate some really cool mazes with DFS most mazes have a grid as a foundation which we can easily represent in a graph now with this graph representation we can run a modified version of DFS from a particular starting point such as the top left corner and check out what happens what's really cool about this is that the only difference here is that instead of inserting neighbor nodes one at a time in a particular predefined order we take the set of neighbors of a node and randomly insert them into a stack when I generated this animation it was just a couple of lines to in difference from the bare-bones DFS algorithm and it really does generate a difficult and beautiful maze I had a hard time solving it without cheating so good luck for those of you who want to try that's all for this video and thanks for watching if you enjoyed the content please hit the like button so that this content will be recommended to more people if you want to see more content like this please don't forget to hit the subscribe button and if you want to more directly support the work of this channel please check out the patreon page length in the description below I'll see you on the next video