As the first example of this tutorial, we compute the height of a binary tree. We solve this problem by composing three parallel tree skeletons: map, downwards accumulate, and reduce.
A sample program for this tutorial is included in the library (samples/tree/height_bin.cpp).
The height of a tree is the maximum depth over all of its nodes, where the depth of a node is the length of the path from the root to that node. Let the type of binary trees be defined as follows.
data tree a = Leaf a | Node a (tree a) (tree a)
Based on the definition above, we can compute the height of a tree in the following three steps. Let tree1 be the input tree.
Assign 1 to each node in preparation for the next step. We can do this with the map skeleton.
tree2 = map one one tree1
  where one x = 1
For each node, compute the depth by a top-down traversal of the tree. We can do this with the downwards accumulate skeleton, using the plus operator and the initial value 0.
tree3 = dAcc (+) id id 0 tree2
  where id x = x
Finally, compute the maximum over all the leaves by a bottom-up traversal of the tree. (We omit the comparison with internal nodes, since their children always have a larger depth.) We can do this with the reduce skeleton and suitable functions as follows.
height = reduce id max_lr tree3
  where id x = x
        max_lr n l r = l `max` r
Note: In fact, this problem can be solved with a single reduce skeleton, but deriving parallel code for such a reduce is somewhat difficult. We therefore take the solution above, for which deriving the parallel program is rather easy.
Since several skeletons impose conditions on their argument functions for an efficient implementation, we verify these conditions and derive suitable auxiliary functions where necessary.
For the downwards accumulate skeleton, the binary operator should be associative. In the running example the binary operator is plus, which satisfies this condition.
The reduce skeleton imposes more complicated conditions and requires us to derive auxiliary functions, since the skeleton is implemented based on tree contraction algorithms. As mentioned in the reference manual, the reduce skeleton requires four auxiliary functions, psi, phiL, phiR, and G, which satisfy the following equations for any values.
max_lr n l r = G (psi n) l r
max_lr n l (max_lr rn rl rr) = G (phiL (psi n) l (psi rn)) rl rr
max_lr n (max_lr ln ll lr) r = G (phiR (psi n) r (psi ln)) ll lr
With simple intuition and calculation, we can find the following solution for them.
psi n = -infty
phiL p l r = l
phiR p r l = r
G p l r = p `max` (l `max` r)
In the previous section, we showed that the functions used in the skeletons satisfy the required conditions, and we derived the auxiliary functions. Now we write a program using the tree skeletons.
To use our tree skeleton library, the program should obey the following rules. (The first two rules are common throughout the SkeTo library.)
The program must contain a "SketoMain( int, char** )" function instead of a "main" function. Execution begins at this function.
In the sample program, this function is defined from line 83 to line 123. Users need not call MPI functions.
The functions used in the skeletons should be function objects that inherit from one of "unary_function", "binary_function", or "ternary_function", with appropriate template instantiation.
For example, the function id for int can be defined as follows.
struct func_id : unary_function< int, int > {
  int operator()( const int& x ) const { return x; }
};
The skeletons use a tree distributed over the processors. Users can distribute the tree with the "tree_skeletons::distribute" function, or by reading data from a file produced by the "tree_util::dist_and_write_to_file" function. The format of the file is shown later.
You can compile a program that uses our tree skeletons by adding the paths "libraries/tree" and "libraries/util" to the include path, in addition to specifying the MPI header directory and the MPI library file.
If you have the C++ binding of MPI, you may compile the program as follows.
mpiCC -I$(SKETO_DIR)/tree -I$(SKETO_DIR)/util -c -o height_bin.o height_bin.cpp
mpiCC -o height_bin height_bin.o
If you do not have the C++ binding of MPI, you may compile the program as follows.
g++ -I$(MPI_DIR)/include -I$(SKETO_DIR)/tree -I$(SKETO_DIR)/util -c -o height_bin.o height_bin.cpp
g++ -o height_bin height_bin.o -L$(MPI_DIR)/lib -lmpich
Note: These compile commands may be different in your environment.
After configuring makefile.base in the root directory of this library, you can use its variables to avoid specifying these directories by hand. Please refer to the "Makefile" in the sample directory for more details.
You can run the program with the "mpirun" command. Note that the number of processors you specify must be larger than the number of distributions of the tree.
A sample execution is as follows.
mpirun -np 5 height_bin dist_tree_char.txt
Note: The commands may be different in your environment.
The tree distribution is computed by the m-bridge technique. Under this tree division scheme, the tree is divided into several segments, each containing at most one cut-node. A cut-node is the node at which the child segments connect to the segment.
A sample distributed binary tree is included in our library ("samples/tree/dist_tree_char.txt").
The first line gives the number of segments p. The following p lines specify the global structure of the segments in a depth-first traversal from left to right: in each line, the first integer specifies whether the segment has children (=0) or not (=1), and the second integer specifies the number of nodes in the segment.
After these, the information about the nodes is given, one line per node. The nodes of each segment are listed in the order of a depth-first traversal. The first integer specifies the type of the node: 0 means it has two children in the segment, 1 means it has no children, and 2 means it has two children but they are in other segments. After that, the value of the node appears.