Even Tree [HackerRank]

Hi all, I’m sorry about not posting very frequently – thankfully, however, my exams will be coming to a close in the next two weeks, so I will be able to return to regular posting soon.

If you are unfamiliar with graph theory, I recommend that you read my previous post on the subject – it gives an introduction into the topic and the ways in which it appears in competitive programming.

Today, we will solve a relatively simple problem on graph theory which can be found on the ‘HackerRank’ website – given a tree (a graph with no cycles and where any node can be reached from any other node) of N nodes numbered from 1 (which is the root node) to N, our program must find the maximum number of edges it can remove from this tree to obtain a set of trees (commonly called a ‘forest’) where every tree has an even number of nodes.

The source code of my final solution to this problem can be found here.

The problem statement states that it is always possible to remove edges from the main tree to create a forest consisting of only trees with even numbers of nodes, so we do not need to concern ourselves with difficult edge cases. This brings us to a relatively simple solution: any ‘subtree’ (a smaller tree which is part of the main tree – it may have its own root separate from the main root) with an even number of nodes can be separated from the node connecting to its root – therefore, the maximum number of edges which can be removed from the tree to satisfy the problem statement is equal to the number of subtrees with even numbers of nodes. We have now reduced the problem to counting the number of nodes in each subtree.

Our first course of action is to devise a method for storing the tree – the technique I used is known as an ‘adjacency list’. In C++, this is essentially a vector where each element represents a node. For simple graphs like the ones used in this problem, each node can be represented by a vector of integers, with each integer representing a node which can be reached from the node represented by the vector itself. Initially our adjacency list will be empty, however, we will fill it by taking each edge from input as two integers (which represent the two nodes connected by the edge in question) connecting the deeper of the two nodes to the node closer to the root (not vice-versa! The edges in our list only need to be unidirectional).

EvenTree1.png

After creating the adjacency list, we can write a function which performs a depth-first traversal of the tree, counting the number of nodes in each subtree by travelling to each node in turn and counting the number of children it has and adding 1 to this value (this addition represents the current node, which acts as the root). We can do this elegantly using recursion – the numbers of nodes in the subtrees rooted at the deepest points of the tree are always 1 (as these nodes have no child nodes). By this logic, we can have our recurring function return 1 when it reaches a node with no child nodes, and in order to calculate the numbers of nodes in the other subtrees, we recur for every child node of each potential root node and save the sum of the results.

EvenTree2.png

Once we have calculated the number of nodes in every possible subtree of the main tree, we can just iterate through our results and increment our final answer by 1 every time we find a subtree with an even number of nodes. Once this process is complete, we can print out our final value.

EvenTree3.png

Graph Theory – Breadth-first Traversal and Depth-first Traversal

Hello all – given that myself and many other students worldwide are currently in the midst of the exam season, I haven’t been very active on this blog. I hope that you all will bear with me for the time being – when my exams are complete, I will be able to start making blog posts more frequently again.

As the title suggests, this post will be about graph theory, a very interesting and useful aspect of mathematics and computer science which is also encountered frequently in competitive programming. The graphs which we will encounter in this post are not the graphs one may associate with high school statistics! Instead, these graphs consist of a series of ‘nodes’ (sometimes called ‘vertices’), connected by ‘edges’.

graph

These graphs are commonly used to represent a set of points in space, and the connections between them. For example, a problem encountered in a programming contest may involve a set of towns connected by roads – these towns can be represented by graph nodes, and the roads can be represented by the edges in between the nodes. Edges in a graph can be both ‘directed’ and ‘undirected’. To illustrate this concept, we can think of two nodes x and y – if the edge between them is undirected, x can be reached from and y and be reached from x. On the other hand, if the edge is directed, only one of the two nodes can be reached from the other node (if we refer back to the analogy with towns and roads, a directed edge could represent a one-way street).

There are various ways of representing graphs in code, however, we will focus on using an ‘adjacency list’. This is a method of storing a graph by representing each node as a set of edges to other nodes – for example, in C++, an adjacency list could be a vector of vectors of integers, where each vector of integers holds the indices of all of the nodes connected to the node represented by the vector in question. Moreover, if the edges of the graph are undirected, the vectors representing two connected nodes would contain the indices of each other (that is, if the two nodes were x and ythe vector representing x would contain the index of the vector representing y, and vice versa). In the programs used in this post, we will construct our graph by first taking the number of nodes as input, and then taking the number of edges. For each edge, we take two integers – these integers represent the indices of the two nodes connected by an edge.

inputgraph.png

Many problems in programming contests require graphs to be ‘traversed’ through – this means that the problem’s answer will need to be obtained by moving through a generated graph. Two methods of traversing through graphs will be discussed in this post: depth-first traversal and breadth-first traversal. When these methods are applied in a program, the nodes of a graph are ‘visited’ in a certain sequence determined by the traversal technique. When we say that we have visited a node, we mean that we have obtained its index. An undirected graph will be used in our demonstrations – we will move through this graph and print the index of each node as we visit them.

The depth-first traversal technique moves through a graph in the manner suggested by the name – it goes ‘deep’ into the graph before looking at adjacent routes. The image below shows every node in a graph being traversed using a depth-first traversal starting at the node with an index of 0 (note that we could have started the traversal at any node).

DFSTraversal.png

A simple way of implementing a depth-first traversal is through recursion – if we represent the graph as a vector of vectors of integers, we first call a recurring function and pass it the index of the starting node as an argument. Then, we recur for every integer in the vector indexed by the function’s argument (since each integer in this vector represents the index of a node connected to the current node). In order to prevent our traversal algorithm from continuously revisiting the same sets of nodes (this could potentially cause a stack overflow), we can use an array of boolean variables which represents the ‘state’ of each node (that is, whether each node has been visited or not). If the boolean at index i of this array is set to true, then the node at index i of the adjacency list has been visited. Every time we are about to recur, we check to ensure that the index we are about to recur with has not already been visited. Furthermore, every time we recur, we mark the newly-visited node as visited by changing its corresponding boolean value in the array.

An implementation of a depth-first traversal using recursion can be found here.

Alternatively, we can also implement a depth-first traversal in C++ using a stack of integers (these integers represent the indices of the graph nodes) – the main operations we are concerned with here are the ‘push’, ‘top’ and ‘pop’ operations. A stack is a data structure which is available in the C++ – it can be used to store and retrieve variables in the same manner objects can be stored and retrieved from a stack in real life. If we imagine our stack as a stack of sheets of paper, the ‘push’ operation adds a sheet of paper to the top of the stack, the ‘top’ operation retrieves the sheet of paper which is currently at the top of and the ‘pop’ operation removes the sheet of paper which is currently at the top.

stackanalogy.png

To perform a depth-first-traversal using a stack in C++, we initially push the index of the starting node onto the stack. Then, we use a ‘while’ loop which runs as long as the stack contains some integers. Every time we run an iteration of this loop, we use the ‘top’ operation to retrieve the index at the top of the stack (clearly, this index would be the newest index which has been pushed) and then the ‘pop’ operation to remove this index from the top. Then, we loop through all of the connecting indices at this index of the adjacency list and push them onto the stack where appropriate. Note that we still should use an array of boolean variables here – we need to check this array before pushing any indices onto the stack in order to ensure that we do not revisit any previously-visited nodes. Since we are using this array, we will eventually reach a point in the algorithm where we stop pushing any new indices, and we will thus empty the stack and finish the traversal.

An implementation of a depth-first traversal using a stack can be found here.

A breadth-first traversal works in a different manner to a depth-first traversal – here, we check every adjacent node in turn before moving onto the next level of depth (hence why the technique is referred to as breadth-first).

BFSTraversal.png

An elegant way to program a breadth-first traversal in C++ is to use a queue – this data structure is vaguely similar to the stack structure in terms of usage, however, we can think of a queue structure as an actual queue of people in real life (as opposed to a stack of papers). Instead of having a ‘top’ operation, we have a ‘front’ operation (which retrieves the person at the front of the queue). Moreover, the ‘pop’ operation removes the person at the front of the queue. Pieces of data can thus be retrieved from a queue in the order they were added – therefore, we can use a ‘while’ loop to program a breadth-first traversal in the same way we use this loop to perform a depth-first traversal. We can begin by pushing the starting node index onto a queue and then pushing each of the nodes connecting to it onto the queue in turn (while still ensuring that we do not revisit previously-visited nodes by checking and updating an array of boolean variables) – this process can be repeated in the loop until there is nothing left to push onto the queue and we run out of nodes which need to be visited. Since the node indices are retrieved from the queue in the order they were pushed, all nodes adjacent to a given node are visited before the other nodes connected to each adjacent node are visited.

An implementation of a breadth-first traversal using a queue can be found here.

That concludes this post! In future posts, we will work through some graph-related problems on competitive programming websites, in addition to continuing the ‘Tetris for Android’ project.