Graph Theory – Breadth-first Traversal and Depth-first Traversal

Hello all – given that myself and many other students worldwide are currently in the midst of the exam season, I haven’t been very active on this blog. I hope that you all will bear with me for the time being – when my exams are complete, I will be able to start making blog posts more frequently again.

As the title suggests, this post will be about graph theory, a very interesting and useful aspect of mathematics and computer science which is also encountered frequently in competitive programming. The graphs which we will encounter in this post are not the graphs one may associate with high school statistics! Instead, these graphs consist of a series of ‘nodes’ (sometimes called ‘vertices’), connected by ‘edges’.

graph

These graphs are commonly used to represent a set of points in space, and the connections between them. For example, a problem encountered in a programming contest may involve a set of towns connected by roads – these towns can be represented by graph nodes, and the roads can be represented by the edges in between the nodes. Edges in a graph can be both ‘directed’ and ‘undirected’. To illustrate this concept, we can think of two nodes x and y – if the edge between them is undirected, x can be reached from and y and be reached from x. On the other hand, if the edge is directed, only one of the two nodes can be reached from the other node (if we refer back to the analogy with towns and roads, a directed edge could represent a one-way street).

There are various ways of representing graphs in code, however, we will focus on using an ‘adjacency list’. This is a method of storing a graph by representing each node as a set of edges to other nodes – for example, in C++, an adjacency list could be a vector of vectors of integers, where each vector of integers holds the indices of all of the nodes connected to the node represented by the vector in question. Moreover, if the edges of the graph are undirected, the vectors representing two connected nodes would contain the indices of each other (that is, if the two nodes were x and ythe vector representing x would contain the index of the vector representing y, and vice versa). In the programs used in this post, we will construct our graph by first taking the number of nodes as input, and then taking the number of edges. For each edge, we take two integers – these integers represent the indices of the two nodes connected by an edge.

inputgraph.png

Many problems in programming contests require graphs to be ‘traversed’ through – this means that the problem’s answer will need to be obtained by moving through a generated graph. Two methods of traversing through graphs will be discussed in this post: depth-first traversal and breadth-first traversal. When these methods are applied in a program, the nodes of a graph are ‘visited’ in a certain sequence determined by the traversal technique. When we say that we have visited a node, we mean that we have obtained its index. An undirected graph will be used in our demonstrations – we will move through this graph and print the index of each node as we visit them.

The depth-first traversal technique moves through a graph in the manner suggested by the name – it goes ‘deep’ into the graph before looking at adjacent routes. The image below shows every node in a graph being traversed using a depth-first traversal starting at the node with an index of 0 (note that we could have started the traversal at any node).

DFSTraversal.png

A simple way of implementing a depth-first traversal is through recursion – if we represent the graph as a vector of vectors of integers, we first call a recurring function and pass it the index of the starting node as an argument. Then, we recur for every integer in the vector indexed by the function’s argument (since each integer in this vector represents the index of a node connected to the current node). In order to prevent our traversal algorithm from continuously revisiting the same sets of nodes (this could potentially cause a stack overflow), we can use an array of boolean variables which represents the ‘state’ of each node (that is, whether each node has been visited or not). If the boolean at index i of this array is set to true, then the node at index i of the adjacency list has been visited. Every time we are about to recur, we check to ensure that the index we are about to recur with has not already been visited. Furthermore, every time we recur, we mark the newly-visited node as visited by changing its corresponding boolean value in the array.

An implementation of a depth-first traversal using recursion can be found here.

Alternatively, we can also implement a depth-first traversal in C++ using a stack of integers (these integers represent the indices of the graph nodes) – the main operations we are concerned with here are the ‘push’, ‘top’ and ‘pop’ operations. A stack is a data structure which is available in the C++ – it can be used to store and retrieve variables in the same manner objects can be stored and retrieved from a stack in real life. If we imagine our stack as a stack of sheets of paper, the ‘push’ operation adds a sheet of paper to the top of the stack, the ‘top’ operation retrieves the sheet of paper which is currently at the top of and the ‘pop’ operation removes the sheet of paper which is currently at the top.

stackanalogy.png

To perform a depth-first-traversal using a stack in C++, we initially push the index of the starting node onto the stack. Then, we use a ‘while’ loop which runs as long as the stack contains some integers. Every time we run an iteration of this loop, we use the ‘top’ operation to retrieve the index at the top of the stack (clearly, this index would be the newest index which has been pushed) and then the ‘pop’ operation to remove this index from the top. Then, we loop through all of the connecting indices at this index of the adjacency list and push them onto the stack where appropriate. Note that we still should use an array of boolean variables here – we need to check this array before pushing any indices onto the stack in order to ensure that we do not revisit any previously-visited nodes. Since we are using this array, we will eventually reach a point in the algorithm where we stop pushing any new indices, and we will thus empty the stack and finish the traversal.

An implementation of a depth-first traversal using a stack can be found here.

A breadth-first traversal works in a different manner to a depth-first traversal – here, we check every adjacent node in turn before moving onto the next level of depth (hence why the technique is referred to as breadth-first).

BFSTraversal.png

An elegant way to program a breadth-first traversal in C++ is to use a queue – this data structure is vaguely similar to the stack structure in terms of usage, however, we can think of a queue structure as an actual queue of people in real life (as opposed to a stack of papers). Instead of having a ‘top’ operation, we have a ‘front’ operation (which retrieves the person at the front of the queue). Moreover, the ‘pop’ operation removes the person at the front of the queue. Pieces of data can thus be retrieved from a queue in the order they were added – therefore, we can use a ‘while’ loop to program a breadth-first traversal in the same way we use this loop to perform a depth-first traversal. We can begin by pushing the starting node index onto a queue and then pushing each of the nodes connecting to it onto the queue in turn (while still ensuring that we do not revisit previously-visited nodes by checking and updating an array of boolean variables) – this process can be repeated in the loop until there is nothing left to push onto the queue and we run out of nodes which need to be visited. Since the node indices are retrieved from the queue in the order they were pushed, all nodes adjacent to a given node are visited before the other nodes connected to each adjacent node are visited.

An implementation of a breadth-first traversal using a queue can be found here.

That concludes this post! In future posts, we will work through some graph-related problems on competitive programming websites, in addition to continuing the ‘Tetris for Android’ project.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s