Maximum Upper-Left Submatrix Sum Problem

Definition of the Problem

This problem is a simplified version of the well-known Maximum Submatrix Problem. The goal is to find the maximum among the sums of the upper-left submatrices of a given matrix (the submatrices that include the top-left element of the matrix).

According to the definitions of the skeletons provided in the library (see the reference manual), the maximum upper-left submatrix sum of a matrix can be computed by the following Haskell program:

  mulss = reduce(max, max) . scan ((+), (+))

Thus, we obtain a parallel program simply by translating this definition into C++ code that uses the library.
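To make the specification concrete, here is a minimal sequential sketch in plain C++ that does not use the library. It assumes that an upper-left submatrix spans rows 0..i and columns 0..j, and that scan with two additions yields the two-dimensional prefix sums. The names Matrix, prefix_sums_2d and max_upper_left_sum are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Sequential counterpart of scan((+), (+)): entry (i, j) of the result is the
// sum of all elements in rows 0..i and columns 0..j of the input.
// Assumes a rectangular matrix.
Matrix prefix_sums_2d(const Matrix &mat) {
    Matrix sums(mat);
    for (std::size_t i = 0; i < sums.size(); ++i) {
        for (std::size_t j = 0; j < sums[i].size(); ++j) {
            // Inclusion-exclusion on the already computed neighbours.
            if (i > 0) sums[i][j] += sums[i - 1][j];
            if (j > 0) sums[i][j] += sums[i][j - 1];
            if (i > 0 && j > 0) sums[i][j] -= sums[i - 1][j - 1];
        }
    }
    return sums;
}

// Sequential counterpart of reduce(max, max): the largest prefix sum.
// Assumes a non-empty matrix.
double max_upper_left_sum(const Matrix &mat) {
    const Matrix sums = prefix_sums_2d(mat);
    double best = sums[0][0];
    for (const auto &row : sums)
        for (double s : row) best = std::max(best, s);
    return best;
}
```

For the matrix with rows (1, -2) and (3, 4), the prefix sums are (1, -1) and (4, 6), so the result is 6.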

Programming in C++ with the Library

The C++ code for the Maximum Upper-Left Submatrix Sum Problem (MULSSP) is listed below:

double mulssp(const matrix<double> &mat){
    dist_matrix<double> dmat(&mat);
    dist_matrix<double> *dmat2 = matrix_skeletons::scan(Add<double>(), Add<double>(), &dmat);
    double *ss = matrix_skeletons::reduce(Max<double>(), Max<double>(), dmat2);
    double ret = *ss;
    delete ss;
    delete dmat2;
    return ret;
}

This code is part of the source file "samples/matrix/mulssp.cpp".
The code works as follows.

  1. First, we distribute the matrix mat among processors by the constructor of dist_matrix class:

        dist_matrix<double> dmat(&mat);
    
  2. Then, applying the scan skeleton to the distributed matrix dmat with two function objects, we obtain another distributed matrix *dmat2 as intermediate data. Each element of *dmat2 is the sum of the elements of the corresponding upper-left submatrix of the original matrix. The scan skeleton is provided as a static member function of the matrix_skeletons class. Add<double>() is a function object for addition on double, provided in primitive_functions.h. Here, the scan skeleton returns a pointer to a dist_matrix object (a distributed matrix).

        dist_matrix<double> *dmat2 = matrix_skeletons::scan(Add<double>(), Add<double>(), &dmat);
    

    NOTE: Function objects passed to the skeletons should inherit from a suitable function-object base class, such as skeleton::binary_function defined in "/library/util/functions.h".

  3. Applying the reduce skeleton to the intermediate data *dmat2 yields the maximum upper-left submatrix sum of the matrix mat. Max<double>() is a function object for the max operator on double.

        double *ss = matrix_skeletons::reduce(Max<double>(), Max<double>(), dmat2);
    
  4. Finally, we copy the result, free the memory that the skeletons allocated, and return the result.

        double ret = *ss;
        delete ss;
        delete dmat2;
        return ret;
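The manual cleanup in step 4 can also be expressed with a smart pointer. The following is a minimal sketch, assuming the results are allocated with new and may be released with delete, as the listing above does. The function heap_result is a hypothetical stand-in for a skeleton call, not part of the library, and read_and_release is likewise an illustrative name.

```cpp
#include <memory>

// Hypothetical stand-in for a skeleton call that returns a heap-allocated
// result which the caller must delete.
double *heap_result(double value) {
    return new double(value);
}

// Takes ownership immediately, so the allocation is deleted on every exit path.
double read_and_release(double value) {
    std::unique_ptr<double> result(heap_result(value));
    return *result;  // copy the value out; the unique_ptr then deletes the pointer
}
```

With this pattern the explicit delete statements disappear, and an early return or an exception between the allocation and the end of the function does not leak memory.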
    

Compilation and Execution

The complete code with a sample main function ("samples/matrix/mulssp.cpp") is listed below:

#include <iostream>
#include <cstdlib>
#include "matrix_skeletons.h"
#include "primitive_functions.h"
using namespace primitive_functions;

double mulssp(const matrix<double> &mat){
    dist_matrix<double> dmat(&mat);
    dist_matrix<double> *dmat2 = matrix_skeletons::scan(Add<double>(), Add<double>(), &dmat);
    double *ss = matrix_skeletons::reduce(Max<double>(), Max<double>(), dmat2);
    double ret = *ss;
    delete ss;
    delete dmat2;
    return ret;
}

int SketoMain( int argc, char **argv )
{

    int n = 1000;
    if(argc > 1){
        n = atoi(argv[1]);
    }
    matrix<double> mat(n, n);
    mat.generate(GenSub<double>());
    double mulss = mulssp(mat);
    if(skeleton::rank==0){
        std::cout << "#procs = " << skeleton::procs << std::endl;
        std::cout << "matrix size = " << n << "x" << n << std::endl;
        std::cout << "Maximum Upper-Left Submatrix Sum = " << mulss << std::endl;
        std::cout << "the matrix = " << std::endl;
        std::cout << mat << std::endl;
    }
    
    return 0;
}

This code includes "matrix_skeletons.h" for the matrix skeletons and "primitive_functions.h" for the function objects. The program begins at the function "SketoMain" instead of "main" (the ordinary entry point). Every program that uses our skeleton library must have a "SketoMain" function as its entry point.

Compilation

The program is compiled with the following command.
(Replace path_to_skeleton_library with the path to the skeleton library, and change the current directory to 'samples/matrix/' first.)

> mpiCC -Wall -O2 -Ipath_to_skeleton_library -o mulssp mulssp.cpp 

NOTE: The name of the MPI C++ compiler, "mpiCC", may differ in some environments.

If you don't have MPI's C++ compiler "mpiCC" (or C++ bindings for MPI), you may compile the program by passing some extra arguments to an ordinary C++ compiler such as g++:

> g++ -Wall -O2 -Ipath_to_skeleton_library -Ipath_to_mpi_include_files -Lpath_to_mpi_libraries -o mulssp mulssp.cpp -Lpath_to_skeleton_library_util -lsketo -lmpich

NOTE: You can also compile the source code with the make (GNU make) command if the 'base makefile' was generated successfully during installation.

> make -f ../../makefile.base mulssp

Other samples and tests in the directory 'samples/matrix/' can be compiled in the same way (or just type 'make' in the directory).

Execution

Then, we can execute the program with the following command:

> mpirun -np n mulssp

Here n is the number of processors used in the execution.
NOTE: The name of the MPI program launcher, "mpirun", may differ in some environments.

The following graph plots the speedup against the number of processors. (The speedup is the ratio of the execution time on one processor to the execution time on n processors.)
[Figure: Speedups vs #Processors]
The figure shows that the parallelization is effective: the speedup is linear in the number of processors. The red line is for 3000 x 3000 matrices, and the green line, which is super-linear, is for 4000 x 4000 matrices.
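The speedup defined above is easy to compute from measured run times. The following sketch also computes the parallel efficiency, a standard companion measure that does not appear in the figure. The function names are hypothetical, and the times must be measured by the reader.

```cpp
// Speedup of a run on n processors relative to one processor: T(1) / T(n).
double speedup(double time_one_proc, double time_n_procs) {
    return time_one_proc / time_n_procs;
}

// Parallel efficiency: speedup divided by the number of processors.
// A value of 1.0 corresponds to linear speedup; above 1.0 is super-linear.
double efficiency(double time_one_proc, double time_n_procs, int procs) {
    return speedup(time_one_proc, time_n_procs) / procs;
}
```

For example, if a run took 8 seconds on one processor and 2 seconds on four processors, the speedup would be 4 and the efficiency 1, which is exactly linear. These timings are made up for illustration.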
