DepSpawn has a simple API and semantics for the parallelization of applications using tasks that the library automatically executes when their data dependencies have been fulfilled. Both of them are briefly described in this page. A more detailed discussion can be found in the publication A framework for argument-based task synchronization with automatic detection of dependencies (DOI 10.1016/j.parco.2013.04.012), which only lacks some of the new synchronization mechanisms added in version 2.0 and described in Interface.
A first thing to take into account is that all the parallel tasks must be expressed as functions and they must communicate with the rest of the application only by means of their arguments. This means that these functions should have return type void
(if this is not the case, DepSpawn is going to ignore the value returned, anyway!) and provide their outputs by means of their arguments. All the kinds of C++ functions are supported:
void myf(int i, std::vector<char>& x) {...}
)struct
or class
)[](Widget &r, double f, const char *str) {...}
)operator()
)std::function
objects, defined in the standard header file <functional>
boost::function
objects, defined in the Boost.Function library header file <boost/function.hpp>
DepSpawn analyzes the types of the formal parameters of a function to identify whether they can be only read, or also modified, by the function. This, together with the actual arguments provided to the function when it is launched to execution, defines the data dependencies of the associated parallel task. Three situations are distinguished regarding the type of the formal parameters:
const
or not.const
references do not allow to modify the object received, so just as the arguments passed by value, they are only inputs to the function. mutable
portions of the object or changing their type by means of const_cast
. Since DepSpawn assumes that const
references are not modified inside a function, in these situations it would be better to turn such parameters into non-const
references so that it knows the truth about your evil intentions.const
references can be both read and modified by the function, which turns them into inputs and outputs, being thus the natural mechanism to return results in DepSpawn.It is very important to remember that DepSpawn only tracks the dependencies on the arguments provided to the tasks, and not on any other piece of data. This means that in the code
DepSpawn will take into account for the dependencies of my_function
the memory occupied by my_struct
, i
and my_ptr
, which are its arguments, but not the memory locations pointed by my_struct.array
and ptr
, which have in fact an extent that is totally unknown to DepSpawn. That is, for every argument x
, the runtime tracks the memory region [&x
, &x
+sizeof(x)
). As a result, the two assignments inside my_function
are untracked, and thus unsafe with respect to other parallel tasks that can access those memory positions.
Once we have expressed which are the inputs and the outputs of our tasks, DepSpawn makes sure that
When an application does not have subtasks, that is, when all the tasks are spawned only by the main thread of execution, it is clear that the preceding tasks are those that the main thread spawned before the one being considered. A program has subtasks when the tasks spawned by the main thread also spawn tasks themselves, which also means that several threads can be spawning tasks. In this more general situation, two questions must be answered:
All the components of the library are made available in the header file depspawn.h
and they are encapsulated in the depspawn
namespace. These are the main components:
f
with any arbitrary list of arguments a
, b
, c
, ... just with spawn(f, a, b, c, ... )
.wait_for_all()
is probably the next most important function. As its name implies, it waits for all the spawned tasks to finish.wait_for_subtasks()
waits for all the subtasks spawned within the current task.wait_for()
waits for the specific set of objects provided as arguments to be ready for reading.set_threads()
is the preferred way to initialize the library, as it allows to specify both the number of threads to use and the stack size of each thread.get_num_threads()
retrieves the number of threads in useset_task_queue_limit()
can help optimize the performance if DepSpawn has been compiled setting the configuration variable DEPSPAWN_FAST_START
on (see Installing DepSpawn). When this variable is active, whenever a thread spawns a new task, it checks how many ready tasks (i.e., already free of dependencies) are waiting to be executed. If this value is greater than a given threshold, the thread tries to execute one of these pending tasks before returning to its current task. By default the library sets this threshold to twice the number of threads in use specified using set_threads()
. The user however is free to modify this threshold using set_task_queue_limit()
.get_task_queue_limit()
reports the task queue limit, as described above, or -1 if the library was compiled with DEPSPAWN_FAST_START
off.Observer
class provides objects which upon destruction wait for all the tasks spawned by their thread since their creation to finish. As a result, for example, the code f
, g
and h
have finished before the printf
statement.depspawn_sync(...)
is a macro that executes the code between its parenthesis and waits for the tasks it has spawned to finish. It is therefore equivalent to ignore()
and Ignore
are a function and a class, respectively, that help express situations in which we do not want to track dependencies on a given task argument. They are discussed in A general mechanism to avoid generating dependencies.The following example illustrates the use of this API to parallelize a program in which the variables i
and j
can be modified in two parallel incr
tasks, while a third task add
must wait for them to finish in order to use them to compute r
.