Grappa  r3821, hash 22cd626d567a91ead5b23302066d1e9469f45c66
Collectives

Functions

template<typename F >
void Grappa::call_on_all_cores (F work)
 Call message (work that cannot block) on all cores, block until ack received from all.
 
template<typename F >
void Grappa::on_all_cores (F work)
 Spawn a private task on each core, block until all complete.
 
template<typename T , T(*)(const T &, const T &) ReduceOp>
T Grappa::allreduce (T myval)
 Called from SPMD context, reduces values from all cores calling allreduce and returns reduced values to everyone.
 
template<typename T , T(*)(const T &, const T &) ReduceOp>
void Grappa::allreduce_inplace (T *array, size_t nelem=1)
 Called from SPMD context.
 
template<typename T , T(*)(const T &, const T &) ReduceOp>
T Grappa::reduce (const T *global_ptr)
 Called from a single task (usually user_main), reduces values from all cores onto the calling node.
 
template<typename T , T(*)(const T &, const T &) ReduceOp>
T Grappa::reduce (GlobalAddress< T > localizable)
 Reduce over a symmetrically allocated object.
 
template<typename T , typename P , T(*)(const T &, const T &) ReduceOp, T(*)(GlobalAddress< P >) Accessor>
T Grappa::reduce (GlobalAddress< P > localizable)
 Reduce over a member of a symmetrically allocated object.
 
template<typename F = nullptr_t>
auto Grappa::sum_all_cores (F func) -> decltype(func())
 Custom reduction from all cores.
 
 

Detailed Description

Function Documentation

template<typename T , T(*)(const T &, const T &) ReduceOp>
T Grappa::allreduce (T myval)

Called from SPMD context, reduces values from all cores calling allreduce and returns reduced values to everyone.

Blocks until the reduction is complete, so it also serves as a global barrier.

Warning
Only one allreduce with a given type/op combination may be in use at a time, because it uses a function-private static variable.

Example:

on_all_cores([]{
  int value = foo();
  int total = Grappa::allreduce<int,collective_add>(value);
});
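
The ReduceOp template parameter is a plain function pointer of type T(*)(const T &, const T &). The following is a minimal single-process sketch of that pattern, not Grappa code: `add_op` and `model_allreduce` are hypothetical names, each vector entry stands in for the `myval` of one core, and no communication or blocking takes place.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an operator such as collective_add. It has the
// T(*)(const T&, const T&) shape that the ReduceOp parameter requires.
template <typename T>
T add_op(const T& a, const T& b) { return a + b; }

// Single-process model of allreduce (an assumption, not the Grappa
// implementation): per_core[i] plays the role of `myval` on core i.
template <typename T, T (*ReduceOp)(const T&, const T&)>
std::vector<T> model_allreduce(const std::vector<T>& per_core) {
  assert(!per_core.empty());
  T total = per_core[0];
  for (std::size_t i = 1; i < per_core.size(); ++i) {
    total = ReduceOp(total, per_core[i]);
  }
  // Every core receives the same reduced value.
  return std::vector<T>(per_core.size(), total);
}
```

Calling `model_allreduce<int, add_op<int>>` on a vector of per-core values returns one copy of the total per core, which mirrors "returns reduced values to everyone".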

Definition at line 334 of file Collective.hpp.

template<typename T , T(*)(const T &, const T &) ReduceOp>
void Grappa::allreduce_inplace (T *array, size_t nelem=1)

Called from SPMD context.

Performs an in-place allreduce (works on arrays). Every element of the array is overwritten with the total from all cores.

Warning
Only one allreduce_inplace with a given type/op combination may be in use at a time, because it uses a function-private static variable.
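
To illustrate the in-place behaviour, here is a single-process sketch that assumes the reduction is applied element by element across cores. It is not Grappa code: `max_op` and `model_allreduce_inplace` are hypothetical names, and `arrays[c]` stands in for the `array` argument as seen on core c.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reduction operator with the ReduceOp function-pointer shape.
template <typename T>
T max_op(const T& a, const T& b) { return a < b ? b : a; }

// Single-process model (an assumption about the semantics): for each index
// i < nelem, combine element i from every core, then overwrite element i
// on every core with the combined value.
template <typename T, T (*ReduceOp)(const T&, const T&)>
void model_allreduce_inplace(std::vector<std::vector<T>>& arrays,
                             std::size_t nelem = 1) {
  assert(!arrays.empty());
  for (const auto& a : arrays) {
    assert(a.size() >= nelem);
  }
  for (std::size_t i = 0; i < nelem; ++i) {
    T total = arrays[0][i];
    for (std::size_t c = 1; c < arrays.size(); ++c) {
      total = ReduceOp(total, arrays[c][i]);
    }
    for (auto& a : arrays) {
      a[i] = total;
    }
  }
}
```

After the call, the first `nelem` elements hold the same values on every modelled core.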

Definition at line 351 of file Collective.hpp.

template<typename F >
void Grappa::call_on_all_cores (F work)

Call message (work that cannot block) on all cores, block until ack received from all.

Like Grappa::on_all_cores() but does not spawn tasks on each core. Can safely be called concurrently with others.

Definition at line 157 of file Collective.hpp.

template<typename F >
void Grappa::on_all_cores (F work)

Spawn a private task on each core, block until all complete.

To be used for any SPMD-style work (e.g. initializing globals). Also used as a primitive in Grappa system code where anything is done on all cores.

Example:

Definition at line 186 of file Collective.hpp.

template<typename T , T(*)(const T &, const T &) ReduceOp>
T Grappa::reduce (const T *global_ptr)

Called from a single task (usually user_main), reduces values from all cores onto the calling node.

Blocks until reduction is complete. Safe to use any number of these concurrently.

Example:

static int x;
void user_main() {
  on_all_cores([]{ x = foo(); });
  int total = reduce<int,collective_add>(&x);
}

Definition at line 369 of file Collective.hpp.

template<typename T , T(*)(const T &, const T &) ReduceOp>
T Grappa::reduce (GlobalAddress< T > localizable)

Reduce over a symmetrically allocated object.

Blocks until reduction is complete. Safe to use any number of these concurrently.

Example:

void user_main() {
auto x = Grappa::symmetric_global_alloc<BlockAlignedInt>();
on_all_cores([]{ x = foo(); });
int total = reduce<int,collective_add>(x);
}

Definition at line 405 of file Collective.hpp.

template<typename T , typename P , T(*)(const T &, const T &) ReduceOp, T(*)(GlobalAddress< P >) Accessor>
T Grappa::reduce (GlobalAddress< P > localizable)

Reduce over a member of a symmetrically allocated object.

The Accessor function is used to pull out the member. Blocks until reduction is complete. Safe to use any number of these concurrently.

Example:

struct BlockAlignedObj {
  int x;
};
int getX(GlobalAddress<BlockAlignedObj> o) {
  return o->x;
}
void user_main() {
  auto x = Grappa::symmetric_global_alloc<BlockAlignedObj>();
  on_all_cores([x]{ x->x = foo(); });
  int total = reduce<int,BlockAlignedObj,collective_add,&getX>(x);
}
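
The Accessor parameter is a second function pointer that extracts the value to reduce from each copy of the object. The sketch below models that idea in a single process and is not Grappa code: `Obj`, `get_x`, `add_int`, and `model_reduce_member` are hypothetical names, and a plain `const P*` is used where Grappa would pass a GlobalAddress< P >.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object type; only the member x takes part in the reduction.
struct Obj {
  int x;
  int other;
};

// Accessor: pulls the member of interest out of one copy of the object.
int get_x(const Obj* o) { return o->x; }

// Reduction operator with the ReduceOp function-pointer shape.
int add_int(const int& a, const int& b) { return a + b; }

// Single-process model (an assumption, not the Grappa implementation):
// copies[c] stands in for the symmetric object's copy on core c.
template <typename T, typename P,
          T (*ReduceOp)(const T&, const T&),
          T (*Accessor)(const P*)>
T model_reduce_member(const std::vector<P>& copies) {
  assert(!copies.empty());
  T total = Accessor(&copies[0]);
  for (std::size_t c = 1; c < copies.size(); ++c) {
    total = ReduceOp(total, Accessor(&copies[c]));
  }
  return total;
}
```

The template argument order in this sketch (value type, object type, operator, accessor) follows the order of the reduce signature documented here.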

Definition at line 446 of file Collective.hpp.

template<typename F = nullptr_t>
auto Grappa::sum_all_cores (F func) -> decltype(func())

Custom reduction from all cores.

Takes a lambda to run on each core, returns the sum of all the results to the caller. This is often easier than using the "custom Accessor" version of reduce, and also works on symmetric addresses.

Basically, reduce() could be implemented as:

int global_x;

// (in main task)
int total = sum_all_cores([]{ return global_x; });

// is equivalent to:
int total = reduce<int,collective_add>(&global_x);
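
The "run a lambda on each core and sum the results" pattern can be sketched in a single process as follows. This is not Grappa code: `model_sum_all_cores` is a hypothetical name, and the lambda receives a core index as an argument, whereas with sum_all_cores the lambda takes no arguments and reads core-local state directly.

```cpp
#include <cassert>
#include <cstddef>

// Single-process model (an assumption, not the Grappa implementation):
// call func once per modelled core and add up the results. The return
// type is deduced from the lambda, as in the documented signature.
template <typename F>
auto model_sum_all_cores(std::size_t ncores, F func)
    -> decltype(func(std::size_t{0})) {
  assert(ncores > 0);
  auto total = func(std::size_t{0});
  for (std::size_t c = 1; c < ncores; ++c) {
    total += func(c);
  }
  return total;
}
```

Because the result type comes from the lambda, the caller does not need to spell out a value type or a reduction operator, which is the convenience the description above refers to.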

Definition at line 484 of file Collective.hpp.