Semantics

DepSpawn has a simple API and semantics for the parallelization of applications using tasks that the library automatically executes when their data dependencies have been fulfilled. Both of them are briefly described in this page. A more detailed discussion can be found in the publication A framework for argument-based task synchronization with automatic detection of dependencies (DOI 10.1016/j.parco.2013.04.012), which only lacks some of the new synchronization mechanisms added in version 2.0 and described in Interface.

Write your parallel tasks as functions

A first thing to take into account is that all the parallel tasks must be expressed as functions and they must communicate with the rest of the application only by means of their arguments. This means that these functions should have return type void (if this is not the case, DepSpawn is going to ignore the value returned, anyway!) and provide their outputs by means of their arguments. All the kinds of C++ functions are supported:

Regular C++ functions (e.g. void myf(int i, std::vector<char>& x) {...})
Member functions (those that belong to a struct or class)
Lambda functions (e.g. [](Widget &r, double f, const char *str) {...})
Functor objects (those that define operator())
std::function objects, defined in the standard header file <functional>
boost::function objects, defined in the Boost.Function library header file <boost/function.hpp>

Express dependencies by means of the parameter types

DepSpawn analyzes the types of the formal parameters of a function to identify whether they can be only read, or also modified, by the function. This, together with the actual arguments provided to the function when it is launched to execution, defines the data dependencies of the associated parallel task. Three situations are distinguished regarding the type of the formal parameters:

In C++ arguments can be passed by value or by reference. The changes made to the arguments passed by value can only be seen inside the function, as in fact they are copies of the original objects provided in the invocation. As a result these arguments are only inputs, and never outputs of the function.
Arguments passed by reference allow a function to access the original objects provided by the caller instead of copies. C++ references can be annotated as const or not.
- const references do not allow to modify the object received, so just as the arguments passed by value, they are only inputs to the function.
  Warning
  Programmers can circunvent this semantics and actually modify these arguments in their functions, for example by writing to mutable portions of the object or changing their type by means of const_cast. Since DepSpawn assumes that const references are not modified inside a function, in these situations it would be better to turn such parameters into non-const references so that it knows the truth about your evil intentions.
- non-const references can be both read and modified by the function, which turns them into inputs and outputs, being thus the natural mechanism to return results in DepSpawn.

It is very important to remember that DepSpawn only tracks the dependencies on the arguments provided to the tasks, and not on any other piece of data. This means that in the code

struct example_struct {
  float value;
  double *array;
};
 
void my_function(const example_struct &arg1, int n, double * const p) {
  arg1.array[n] = *p;
  *p = arg1.array[n+1];
}
 
int main() 
{ example_struct my_struct;
  double *my_ptr;
  int i;
  ...
  spawn(my_function, my_struct, i, my_ptr);
  ...
}

DepSpawn will take into account for the dependencies of my_function the memory occupied by my_struct, i and my_ptr, which are its arguments, but not the memory locations pointed by my_struct.array and ptr, which have in fact an extent that is totally unknown to DepSpawn. That is, for every argument x, the runtime tracks the memory region [&x, &x+sizeof(x)). As a result, the two assignments inside my_function are untracked, and thus unsafe with respect to other parallel tasks that can access those memory positions.

The fine print

Once we have expressed which are the inputs and the outputs of our tasks, DepSpawn makes sure that

Whenever a task has a read-only dependency on an argument, i.e., it is only an input to the task, the task is only scheduled for execution when all the preceding tasks that write to some memory position within that argument have finished.
When a task can write to an argument, i.e., it is a potential output for the task, the task is only scheduled for execution when all the preceding tasks that access some memory position within that argument, no matter it is for reading or for writing, have finished.

When an application does not have subtasks, that is, when all the tasks are spawned only by the main thread of execution, it is clear that the preceding tasks are those that the main thread spawned before the one being considered. A program has subtasks when the tasks spawned by the main thread also spawn tasks themselves, which also means that several threads can be spawning tasks. In this more general situation, two questions must be answered:

Which tasks should be considered as preceding tasks of a new one? As of now, DepSpawn considers as preceding tasks all the tasks spawned by any thread before the considered one.
When does a task with subtasks finish? As of now, DepSpawn considers that a task finishes only when all its subtasks finish.

Interface

All the components of the library are made available in the header file depspawn.h and they are encapsulated in the depspawn namespace. These are the main components:

The most important function is of course spawn(), which allows to request the parallel execution of any function f with any arbitrary list of arguments a, b, c, ... just with spawn(f, a, b, c, ... ).
wait_for_all() is probably the next most important function. As its name implies, it waits for all the spawned tasks to finish.
wait_for_subtasks() waits for all the subtasks spawned within the current task.
wait_for() waits for the specific set of objects provided as arguments to be ready for reading.
set_threads() is the preferred way to initialize the library, as it allows to specify both the number of threads to use and the stack size of each thread.
get_num_threads() retrieves the number of threads in use
set_task_queue_limit() can help optimize the performance if DepSpawn has been compiled setting the configuration variable DEPSPAWN_FAST_START on (see Installing DepSpawn). When this variable is active, whenever a thread spawns a new task, it checks how many ready tasks (i.e., already free of dependencies) are waiting to be executed. If this value is greater than a given threshold, the thread tries to execute one of these pending tasks before returning to its current task. By default the library sets this threshold to twice the number of threads in use specified using set_threads(). The user however is free to modify this threshold using set_task_queue_limit().
get_task_queue_limit() reports the task queue limit, as described above, or -1 if the library was compiled with DEPSPAWN_FAST_START off.
The Observer class provides objects which upon destruction wait for all the tasks spawned by their thread since their creation to finish. As a result, for example, the code
{ depspawn::Observer o;

spawn(f, a, b);

spawn(g, v[i], b);

spawn(h, a, q->value);

}

printf(...);

depspawn::Observer
When destroyed, it makes sure that all the tasks spawned by the current thread since its creation hav...
Definition: depspawn.h:693

makes sure that the tasks f, g and h have finished before the printf statement.
depspawn_sync(...) is a macro that executes the code between its parenthesis and waits for the tasks it has spawned to finish. It is therefore equivalent to
{ depspawn::Observer o; ... }
ignore() and Ignore are a function and a class, respectively, that help express situations in which we do not want to track dependencies on a given task argument. They are discussed in A general mechanism to avoid generating dependencies.

Example

The following example illustrates the use of this API to parallelize a program in which the variables i and j can be modified in two parallel incr tasks, while a third task add must wait for them to finish in order to use them to compute r.

#include <iostream>
#include "depspawn/depspawn.h"
 
using namespace depspawn;
 
void incr(int &i) {
  i++;
}
 
void add(int &r, int a, int b) {
  r = a + b;
}
 
int main()
{
  int i = 0, j = 8, r;
  
  set_threads();
 
  spawn(incr, i);
  spawn(incr, j);
  spawn(add, r, i, j);
  
  wait_for_all();
 
  std::cout << "Result=" << r << std::endl;
 
  bool test_ok = (r == 10);
  
  std::cout << "TEST " << (test_ok ? "SUCCESSFUL" : "UNSUCCESSFUL") << std::endl;
  
  return !test_ok;
}

Table of Contents