Variance is the average of the squared deviations from the mean. Assuming that the input list is [a1, a2, ..., an], the mathematical definition is as follows:

    ave = (a1 + a2 + ... + an) / n
    var = ((a1 - ave)^2 + (a2 - ave)^2 + ... + (an - ave)^2) / n
Using the definitions of the skeletons provided in the library (see the reference manual), a Haskell program that calculates variance can be written as follows:
var as = sqSum / n
  where
    n     = fromIntegral (length as)
    sum   = reduce (+) as
    ave   = sum / n
    sqSum = reduce (+) (map square (map (subtract ave) as))
Thus, we can get a parallel program simply by translating it into C++ code with the library.
The C++ code for variance is listed below:
dist_list<double> *as = new dist_list<double>(gen, SIZE);
double ave = list_skeletons::reduce(add, add_unit, as) / SIZE;
Sub sub(ave);
list_skeletons::map_ow(sub, as);
list_skeletons::map_ow(sqr, as);
double var = list_skeletons::reduce(add, add_unit, as) / SIZE;
This code is part of the source file "variance.cpp" in the samples directory. An explanation of the code follows.
dist_list<double> *as = new dist_list<double>(gen, SIZE);

This line generates a distributed list "as" of SIZE elements, where the element at index i is initialized to gen(i).

double ave = list_skeletons::reduce(add, add_unit, as) / SIZE;

This line computes the average: the reduce skeleton sums up all the elements with the function object "add" (whose unit element is "add_unit"), and the sum is divided by SIZE.

Sub sub(ave);
list_skeletons::map_ow(sub, as);
list_skeletons::map_ow(sqr, as);

These lines subtract the average from each element and then square each element. The map_ow skeleton is a map that overwrites its input list with the result.

double var = list_skeletons::reduce(add, add_unit, as) / SIZE;

This line computes the variance: the sum of the squared deviations is divided by SIZE.
Each function object must inherit from a base function object class. An example is as follows:
struct Gen : public unary_function<int, double> {
  double operator()(int index) const {
    return static_cast<double>(index);
  }
} gen;
"Gen" inherits "unary_function", since it is one argument function object. "unary_function" is template class, we must give it type of argument and type of return value. "operator()" needs "const" qualifier, because we want to keep referential transparency of function object. "binary_function" is also similar.
The complete code with a sample main function is listed below ("samples/list/variance.cpp"):
#include <iostream>
#include "list_skeletons.h"

using namespace std;

const int SIZE = 1000;

struct Gen : public unary_function<int, double> {
  double operator()(int index) const { return static_cast<double>(index); }
} gen;

struct Add : public binary_function<double, double, double> {
  double operator()(double x, double y) const { return x + y; }
} add;
const double add_unit = 0.0;

struct Sqr : public unary_function<double, double> {
  double operator()(double x) const { return x * x; }
} sqr;

struct Sub : public unary_function<double, double> {
  double val;
  Sub(double val_) : val(val_) { }
  double operator()(double x) const { return x - val; }
};

int SketoMain(int argc, char **argv)
{
  dist_list<double> *as = new dist_list<double>(gen, SIZE);
  double ave = list_skeletons::reduce(add, add_unit, as) / SIZE;
  Sub sub(ave);
  list_skeletons::map_ow(sub, as);
  list_skeletons::map_ow(sqr, as);
  double var = list_skeletons::reduce(add, add_unit, as) / SIZE;
  if (skeleton::rank == 0) {
    cout << "average:" << ave << "\n" << "variance:" << var << "\n";
  }
  return 0;
}
The program will begin at the function "SketoMain" instead of the "main" function (the ordinary entry point). All programs using our skeleton library must have a "SketoMain" function as their entry point.
Compilation is simply done by the following command:
(You should specify a suitable path to the skeleton library in path_to_skeleton_library
and change the current directory to 'samples/list/')
> mpiCC -Wall -O3 -o variance variance.cpp
NOTE: The name of the C++ compiler with MPI, "mpiCC", may differ in some environments.
If you don't have MPI's C++ compiler "mpiCC" (or C++ bindings for MPI), you may compile the program by passing some extra arguments to an ordinary C++ compiler such as g++:
> g++ -Wall -O3 -Ipath_to_skeleton_library -Ipath_to_mpi_include_files -Lpath_to_mpi_libraries -o variance variance.cpp -lmpich
NOTE: You can also compile the source code by just using the make (GNU make) command if you have successfully generated the 'base makefile' during the installation.
> make -f ../../makefile.base variance
Other samples and tests in the directory 'samples/list/' can be compiled in the same way (or just type 'make' in the directory).
Then, we can execute the program by the following command:
> mpirun -np n variance
Here n is the number of processors involved in the execution.
NOTE: The executor of MPI programs, "mpirun", may differ in some environments.