Profiling
Built-in Profiling
Internally, NP can record the time taken for expensive operations and for internal ParticleLoops. The time taken for user-written loops can also be recorded. For the recorded times to be easy to interpret, it helps to give each ParticleLoop a meaningful name. Code regions that are not ParticleLoops can also be added to the set of recorded regions by using the ProfileRegion class.
Warning
Profiling of regions is disabled unless NESO_PARTICLES_PROFILING_REGION is defined.
To enable built-in profiling of regions, the preprocessor variable NESO_PARTICLES_PROFILING_REGION must be defined. If this variable is not defined, the profiling functions and methods become no-ops. The variable can be defined through compiler flags or defined before including NP, as follows.
#define NESO_PARTICLES_PROFILING_REGION
#include <neso_particles.hpp>
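The no-op behaviour described above is a common preprocessor pattern. The following is a minimal sketch of that pattern only, not the NP implementation: the macro `SKETCH_PROFILING_REGION` and the class `SketchProfiler` are hypothetical names invented for illustration.

```cpp
// Hypothetical sketch; none of these names are part of NESO-Particles.
// Remove this define, or pass -DSKETCH_PROFILING_REGION on the compiler
// command line instead, to toggle recording.
#define SKETCH_PROFILING_REGION

#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class SketchProfiler {
public:
  // Store a named duration in seconds. With the macro undefined the body is
  // empty, so the call does nothing and no data is kept.
  inline void record([[maybe_unused]] const std::string &name,
                     [[maybe_unused]] const double seconds) {
#ifdef SKETCH_PROFILING_REGION
    assert(seconds >= 0.0);
    names.push_back(name);
    times.push_back(seconds);
#endif
  }

  // Number of durations recorded so far.
  inline std::size_t size() const { return times.size(); }

private:
  std::vector<std::string> names;
  std::vector<double> times;
};
```

Because the call sites stay the same whether or not the macro is defined, instrumentation of this kind can be left in place in production code.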
The ProfileRegion class records the time taken for a specific piece of code. These regions are recorded within a ProfileMap instance. There is a ProfileMap member within the SYCLTarget type which records the regions for ParticleLoops which execute on that compute device.
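To make the relationship between a region and the map that stores it concrete, here is a self-contained sketch using `std::chrono`. The classes `SketchRegion` and `SketchMap` are hypothetical stand-ins and do not reproduce the NP classes. The sketch assumes that a region starts timing on construction and that regions added while the map is disabled are discarded.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a timed region; not the NESO-Particles class.
class SketchRegion {
public:
  using clock_type = std::chrono::steady_clock;
  std::string name;

  // Timing starts when the object is created.
  explicit SketchRegion(const std::string &name_in)
      : name(name_in), time_start(clock_type::now()), time_end(time_start) {}

  // Mark the end of the region.
  inline void end() { time_end = clock_type::now(); }

  // Time between construction and the last call to end(), in seconds.
  inline double elapsed_seconds() const {
    return std::chrono::duration<double>(time_end - time_start).count();
  }

private:
  clock_type::time_point time_start;
  clock_type::time_point time_end;
};

// Hypothetical stand-in for a container of regions, keyed by region name.
class SketchMap {
public:
  inline void enable() { enabled = true; }
  inline void disable() { enabled = false; }

  // Assumption: regions are only stored while recording is enabled.
  inline void add_region(const SketchRegion &region) {
    assert(region.elapsed_seconds() >= 0.0);
    if (enabled) {
      times[region.name].push_back(region.elapsed_seconds());
    }
  }

  // Number of stored regions with the given name.
  inline std::size_t count(const std::string &name) const {
    const auto it = times.find(name);
    return it == times.end() ? 0 : it->second.size();
  }

private:
  bool enabled = false;
  std::map<std::string, std::vector<double>> times;
};
```

Keeping the map separate from the regions means one region type can be collected in several places, for example one map per compute device.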
The following code snippet demonstrates how to profile ParticleLoops and user-defined regions. At the end, each rank writes its profiling data to a JSON file. The write operation is not collective, so users may choose to write the data from specific ranks only.
/*
NESO_PARTICLES_PROFILING_REGION should be defined before NESO-Particles is
included.
#define NESO_PARTICLES_PROFILING_REGION
#include <neso_particles.hpp>
*/
inline void profile_regions_example(
    // Input ParticleGroup - we will loop over all particles in this
    // ParticleGroup.
    ParticleGroupSharedPtr particle_group
) {

  // Reference to the ProfileMap for the SYCLTarget inside the ParticleGroup.
  // This is the ProfileMap which will be used for the ParticleLoops that have
  // this ParticleGroup as an iteration set.
  auto &profile_map = particle_group->sycl_target->profile_map;
  const int rank = particle_group->sycl_target->comm_pair.rank_parent;

  // Create some user written loops. The name of the loop is used in the
  // profiling.

  // Example where the name "pbc" and the time taken is recorded.
  auto loop_pbc = particle_loop(
    "pbc",
    particle_group,
    [=](auto P){
      P.at(0) = Kernel::fmod(P.at(0) + 8.0, 8.0);
      P.at(1) = Kernel::fmod(P.at(1) + 8.0, 8.0);
    },
    Access::write(Sym<REAL>("P"))
  );

  // Example where the kernel can be combined with additional metadata.
  const REAL dt = 0.001;
  auto loop_advect = particle_loop(
    "advect",
    particle_group,
    Kernel::Kernel(
      [=](auto P, auto V){
        P.at(0) += dt * V.at(0);
        P.at(1) += dt * V.at(1);
      },
      Kernel::Metadata(
        Kernel::NumBytes(6 * sizeof(REAL)),
        Kernel::NumFLOP(4)
      )
    ),
    Access::write(Sym<REAL>("P")),
    Access::read(Sym<REAL>("V"))
  );

  // Enable recording of events and regions in the ProfileMap (default
  // disabled).
  profile_map.enable();

  // Users can define their own regions and add them to the ProfileMap. The
  // region starts on creation of the object.
  auto r = ProfileRegion("NameFirstPart", "NameSecondPart");

  // Do something to time here.

  // End the region to profile
  r.end();
  // Add our custom region to the ProfileMap
  profile_map.add_region(r);

  // Run something to profile and record the regions from internal
  // implementations and ParticleLoops.
  for (int stepx = 0; stepx < 20; stepx++) {
    loop_advect->execute();
    loop_pbc->execute();
    particle_group->hybrid_move();
    particle_group->cell_move();
  }

  // Disable recording of events and regions.
  profile_map.disable();

  // Write the regions and events to a json file with name
  // regions_example.rank.json.
  profile_map.write_events_json("regions_example", rank);
}
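The metadata attached to the "advect" kernel above (`6 * sizeof(REAL)` bytes and 4 FLOP) can be combined with a measured loop time to estimate achieved throughput. The following sketch shows only the arithmetic. It assumes the metadata values are per-particle figures and that `REAL` is an 8-byte `double`. The function `estimate_throughput`, the struct `SketchThroughput`, and the particle count and timing in the usage note are hypothetical and not taken from NP.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type: achieved memory bandwidth and FLOP rate.
struct SketchThroughput {
  double gbytes_per_second;
  double gflops;
};

// Estimate throughput for one loop execution.
// Assumption: bytes_per_particle and flop_per_particle are per-particle
// values, as declared in the kernel metadata.
inline SketchThroughput estimate_throughput(const std::size_t num_particles,
                                            const std::size_t bytes_per_particle,
                                            const std::size_t flop_per_particle,
                                            const double seconds) {
  assert(seconds > 0.0);
  const double n = static_cast<double>(num_particles);
  const double total_bytes = n * static_cast<double>(bytes_per_particle);
  const double total_flop = n * static_cast<double>(flop_per_particle);
  // Convert to units of 1e9 per second.
  return {total_bytes / seconds * 1.0e-9, total_flop / seconds * 1.0e-9};
}
```

For example, `estimate_throughput(1000000, 6 * sizeof(double), 4, 0.001)` corresponds to one million particles advected in one millisecond and gives approximately 48 GB/s and 4 GFLOP/s.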
Plotting ProfileRegions
We provide a helper script, scripts/profile_region_plotting/profile_region_plotting.py, to aid plotting the regions written to the JSON file. The requirements for this script are listed in the requirements.txt file and can be installed into a Python virtual environment as follows:
# Create and activate a virtual environment
$ python3 -m venv profile_region_plotting_env
$ source profile_region_plotting_env/bin/activate
# Install the dependencies into the virtual environment
(profile_region_plotting_env)$ pip install -r requirements.txt
# pip output omitted
# Run the script
(profile_region_plotting_env)$ python profile_region_plotting.py -s <start_time> -e <end_time> *.json
# Run with -h for a complete set of options.
The script plots time on the x-axis and MPI rank on the y-axis. On launch, all recorded events within the specified time window are plotted. To simplify the view, double-click an item in the legend on the right-hand side to show only the regions with that name. Other regions can then be added to the view one by one by clicking on them in the legend.