High Level Functions
We provide several high-level free functions that implement generic operations. These operations are typically not captured directly by the looping abstractions, and implementing them without writing a dedicated SYCL implementation may result in poor performance.
CellDatConst Arithmetic
The function cell_dat_const_loop_element_wise applies a given scalar function element-wise to the CellDatConst instances provided as arguments and assigns the results to the output CellDatConst.
The output may be one of the arguments.
If an argument has exactly one row and one column per cell, its single per-cell value is broadcast to every element of a matrix with the same shape as the output matrix.
void cell_dat_const_loop_element_wise_example(SYCLTargetSharedPtr sycl_target) {

  // Create CellDatConsts of the same size and shape.
  const int cell_count = 61;
  int nrow = 3;
  int ncol = 7;
  auto a =
      std::make_shared<CellDatConst<REAL>>(sycl_target, cell_count, nrow, ncol);
  auto b =
      std::make_shared<CellDatConst<REAL>>(sycl_target, cell_count, nrow, ncol);
  auto c =
      std::make_shared<CellDatConst<REAL>>(sycl_target, cell_count, nrow, ncol);
  auto d =
      std::make_shared<CellDatConst<REAL>>(sycl_target, cell_count, nrow, ncol);

  // Create some initial data for the arguments a,b and c.
  std::mt19937 rng(522342 + sycl_target->comm_pair.rank_parent);
  std::uniform_real_distribution<REAL> dist(1.0, 4.0);
  auto h_a = a->get_all_cells();
  auto h_b = b->get_all_cells();
  auto h_c = c->get_all_cells();
  for (int cellx = 0; cellx < cell_count; cellx++) {
    for (int colx = 0; colx < ncol; colx++) {
      for (int rowx = 0; rowx < nrow; rowx++) {
        const auto ta = dist(rng);
        const auto tb = dist(rng);
        const auto tc = dist(rng);
        h_a.at(cellx)->at(rowx, colx) = ta;
        h_b.at(cellx)->at(rowx, colx) = tb;
        h_c.at(cellx)->at(rowx, colx) = tc;
      }
    }
  }
  a->set_all_cells(h_a);
  b->set_all_cells(h_b);
  c->set_all_cells(h_c);
  d->fill(0);

  // d[cell, row, col] =
  //    a[cell, row, col] * b[cell, row, col] + c[cell, row, col]
  //
  // Note that this is a scalar valued function of scalars that is applied
  // element wise. This function should be device copyable and executable
  // on the compute device.
  cell_dat_const_loop_element_wise(
      d, [=](REAL a, REAL b, REAL c) -> REAL { return a * b + c; }, a, b, c);
}
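The element-wise and broadcast semantics described above can be illustrated for a single cell with a small host-only sketch. This is not the library implementation: the HostMatrix type, its column-major storage, and the host_element_wise function are hypothetical stand-ins that use only the C++ standard library, and the sketch is restricted to three arguments for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side stand-in for the matrix one cell holds. This is NOT
// the NESO-Particles CellDatConst type; column-major storage is an assumption
// made only for this sketch.
struct HostMatrix {
  int nrow;
  int ncol;
  std::vector<double> data;

  HostMatrix(int rows, int cols, double value = 0.0)
      : nrow(rows), ncol(cols),
        data(static_cast<std::size_t>(rows * cols), value) {}

  std::size_t index(int row, int col) const {
    return static_cast<std::size_t>(col * nrow + row);
  }
  double &at(int row, int col) { return data[index(row, col)]; }
  double at(int row, int col) const { return data[index(row, col)]; }

  // Read an element; a 1x1 matrix returns its single value for every index.
  double at_broadcast(int row, int col) const {
    return (nrow == 1 && ncol == 1) ? data[0] : at(row, col);
  }
};

// An argument is usable if it matches the output shape or is 1x1.
inline bool shape_ok(const HostMatrix &arg, const HostMatrix &out) {
  const bool same = (arg.nrow == out.nrow) && (arg.ncol == out.ncol);
  const bool scalar = (arg.nrow == 1) && (arg.ncol == 1);
  return same || scalar;
}

// Apply a scalar function of three scalars to every element of the output.
// The output may alias one of the arguments because each element is read
// before the same element is written.
template <typename FUNC>
void host_element_wise(HostMatrix &out, FUNC func, const HostMatrix &a,
                       const HostMatrix &b, const HostMatrix &c) {
  assert(shape_ok(a, out) && shape_ok(b, out) && shape_ok(c, out));
  for (int col = 0; col < out.ncol; col++) {
    for (int row = 0; row < out.nrow; row++) {
      out.at(row, col) =
          func(a.at_broadcast(row, col), b.at_broadcast(row, col),
               c.at_broadcast(row, col));
    }
  }
}

// d = a * b + c, where c holds one value that is broadcast to all elements.
inline HostMatrix host_element_wise_example() {
  HostMatrix a(3, 7, 2.0);
  HostMatrix b(3, 7, 3.0);
  HostMatrix c(1, 1, 0.5);
  HostMatrix d(3, 7);
  host_element_wise(
      d, [](double x, double y, double z) { return x * y + z; }, a, b, c);
  return d;
}
```

In this sketch every element of the matrix returned by host_element_wise_example is 2.0 * 3.0 + 0.5, because the 1x1 argument c contributes the same value at every (row, col) position.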
Cellwise Broadcast
The cellwise_broadcast function sets the specified component of the specified property, on every particle, to the value stored in the passed array at the index of the cell that contains the particle.
inline void cellwise_broadcast(
  ParticleGroupSharedPtr particle_group
) {

  // Get the number of cells on this MPI rank.
  const int cell_count = particle_group->domain->mesh->get_cell_count();
  std::vector<INT> h_cell_values(cell_count);
  for (int ix = 0; ix < cell_count; ix++) {
    h_cell_values.at(ix) = ix + 1;
  }

  // All particles in cell i will have ID.at(0) set to i + 1.
  cellwise_broadcast(particle_group, Sym<INT>("ID"), 0, h_cell_values);

  // The cellwise_broadcast function can also be called with a ParticleSubGroup.
}