TN-04-2_ProposalParticleDsl
===========================

.. meta::
   :description: technical note
   :keywords: T/AW087/21, Support and Coordination, Report 2057699-TN-04-01, Proposal for a Particle DSL

T/AW087/21 Support and Coordination

Report 2057699-TN-04-01: Proposal for a Particle DSL

Steven Wright and Edward Higgins, University of York

Gihan Mudalige, Zaman Lantra, Ben McMillan, and Tom Goffrey, University of Warwick

February 24, 2023

1 Summary
---------

Project NEPTUNE (NEutrals & Plasma TUrbulence Numerics for the Exascale) is concerned with the development of a new code for the simulation of a next generation fusion reactor. Both fluid and particle models will be required by such a complex simulation code, along with methods of coupling the two models. In NEPTUNE, the fluid model is likely to take the form of a high-order finite element method, while the particle model will necessarily be particle-in-cell (PIC).

In our previous reports we focussed on the performance and portability of various programming models and domain specific languages (DSLs) for both fluid and particle methods. In this project we have identified some DSLs that can be used to develop fluid simulations (such as BOUT++ [1, 2], Nektar++ [3], OPS/OP2 [4, 5, 6] and UFL [7]), but have not identified any DSLs focussed on particle methods in which the particles must interact primarily with the mesh, as in PIC. This report therefore documents our progress towards developing a DSL that can be used for PIC methods, embedded within the OP2 DSL.

1.1 The Particle-in-Cell Method
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The PIC method is a well established procedure for modelling the behaviour of charged particles in the presence of electric and magnetic fields [8, 9]. Discrete particles are tracked in a Lagrangian frame, while the electric and magnetic fields are stored at stationary points on a fixed Eulerian mesh. The electric and magnetic fields evolve according to Maxwell's equations (Equations (1)-(4)):

.. math::
   :nowrap:

   \begin{align}
   \nabla \cdot \vec{E} &= \frac{\rho}{\epsilon_0} \tag{1} \\
   \nabla \cdot \vec{B} &= 0 \tag{2} \\
   \frac{\partial \vec{B}}{\partial t} &= -\nabla \times \vec{E} \tag{3} \\
   \frac{\partial \vec{E}}{\partial t} &= \frac{1}{\mu_0 \epsilon_0} \nabla \times \vec{B} - \frac{1}{\epsilon_0} \vec{J} \tag{4}
   \end{align}

The force experienced by a particle is calculated according to the Lorentz force (Equation (5)):

.. math::
   :nowrap:

   \begin{equation}
   \vec{F} = q \left( \vec{E} + \vec{v} \times \vec{B} \right) \tag{5}
   \end{equation}

A typical PIC method can be thought of as two coupled solvers: one is responsible for updating the electric and magnetic fields according to Maxwell's equations, while the other calculates the movement of particles according to the Lorentz force. These are referred to as the field solver and the particle mover (sometimes called the particle pusher), respectively.

The main time loop of the core PIC algorithm consists of: solving the field values on the computational mesh; weighting these values to determine the fields at particle locations; updating the particle velocities and positions; and depositing the particle charge/current back to grid points. The algorithm is summarised in Figure 1.

Figure 1: Flow chart summarising the key components of the PIC algorithm

Since the field solve acts upon a grid or a mesh, it can easily be implemented using the numerous DSLs that have been developed for such simulations (mentioned previously). The goal of this work is to develop a DSL extension that allows us to implement the particle mover within the same framework.

2 Developing a PIC Domain Specific Language
-------------------------------------------

Domain Specific Languages (DSLs) allow us to bridge the gap between domain scientists and application developers by allowing the domain specialists to write their calculations using high level abstractions specific to their domain. These abstractions typically take the form of an API (Application Programming Interface) embedded in a host language such as C/C++ or Fortran.

A DSL and its associated parser(s)/compiler(s) can then translate this high level abstraction into various low-level parallelisations such as OpenMP, MPI, CUDA, HIP, etc., introducing optimisations to the code using compiler techniques such as source-to-source code translation and code generation. The lower level implementation focuses on how the computation can be executed in the most efficient way on the
given hardware platform, extracting and analysing the computation, data access/communication and synchronisation.

We have identified numerous DSLs for developing structured and unstructured mesh computations (e.g. OP2 [5, 10], OPS [4, 6], BOUT++ [2, 1], PATUS [11], UFL [7], PSyclone [12], etc.), but none that include support for particle methods. In this report, we detail our progress towards developing an extension to the OP2 DSL with a focus on implementing PIC methods. We focus on OP2's loop-level abstraction as a first step towards a proposal for a high-level DSL, like that found in Firedrake.

2.1 OP2: A DSL for Unstructured Mesh Computations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

OP2 [5] is a high-level abstraction and active library targeting parallel execution of unstructured mesh applications. It can automatically generate code for OpenMP, MPI, CUDA, OpenACC and OpenCL using source-to-source translation. It has a well defined API, and the execution algorithm can be divided into four distinct parts that together allow the mesh to be defined completely and abstractly:

1. Defining sets
2. Defining connectivity (or mapping) between the sets
3. Defining data on sets
4. Operations over sets

For example, a set could consist of the cells, nodes, edges and/or faces of the mesh; data on sets could be the current over an edge; connectivity could be the mapping from an edge to its two connected nodes; and the operations could be the kernel calculations (solving partial differential equations) performed by iterating over edges.

Unstructured mesh applications inherently have indirect data accesses, and the main challenges in developing an application concern data locality, data layout in memory, data dependencies and data conflicts. OP2 handles some of these issues by colouring the mesh, using atomics (hardware dependent) and partitioning with halo regions. Since the PIC DSL to be developed during this research targets unstructured meshes, the new development can be modelled on OP2.

2.2 OP-PIC: Unstructured Mesh Particle-in-Cell DSL Design
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As stated in Section 1.1, the main loop of a standard PIC algorithm involves four key steps:

1. Solve electric and magnetic fields (field solver)
2. Weight fields to particles
3. Push/move particles
4. Weight particles to mesh

In many codes, additional routines may also be interleaved, for example injecting particles or computing particle collisions. In all of these routines the computations typically involve iterating over particles or mesh points (i.e., cells, nodes, edges etc.) and solving mathematical equations such as partial differential equations.

Similar to the OP2 execution algorithm (briefly described in Section 2.1), the proposed DSL comprises the same four distinct parts. Here we give an overview of the API for particle movement within a simple 2D quadrilateral unstructured mesh.

Figure 2: An example unstructured mesh with cells and nodes

2.2.1 Defining sets
^^^^^^^^^^^^^^^^^^^

The mesh in Figure 2 can be defined as a collection of cells (quadrilaterals) and nodes. There are 6 cells and 12 nodes, which can be declared using ``op_decl_set``.

.. code-block:: c++

   int n_nodes = 12;
   int n_cells = 6;

   op_set nodes_set = op_decl_set(n_nodes, "mesh_nodes");
   op_set cells_set = op_decl_set(n_cells, "mesh_cells");

The particle sets can be declared with ``op_decl_particle_set``, allowing multiple particle sets to be defined if there is more than one particle species.

.. code-block:: c++

   op_set particles_set = op_decl_particle_set("x particles", cells_set);

The above will create an empty particle set, assuming that particles will be injected during the main loop. However, if the initial number of particles is known, it can be set when defining the particle set with the API call below.

.. code-block:: c++

   op_set op_decl_particle_set(int size, char const *name, op_set cells_set);

2.2.2 Defining connectivity (or mapping) between the sets
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The connectivity is declared through mappings between sets, using ``op_decl_map``. Considering the mesh in Figure 2, there could be cell to node mappings as well as cell to cell mappings.

.. code-block:: c++

   int NODES_PER_CELL = 4;
   int NEIGHBOUR_CELLS = 4;

   // Entries are the node labels of Figure 2 (N1..N12), four per cell.
   int cell_to_nodes[] = {1, 2, 5, 6,   2, 3, 7, 6,    3, 4, 7, 8,
                          5, 6, 9, 10,  6, 7, 10, 11,  7, 8, 11, 12};
   // Entries are the neighbouring cell labels (C1..C6); -1 means no neighbour.
   int cell_to_cells[] = {2, 4, -1, -1,  1, 3, 5, -1,  2, 6, -1, -1,
                          1, 5, -1, -1,  2, 4, 6, -1,  3, 5, -1, -1};

   op_map cell_to_nodes_map = op_decl_map(cells_set, nodes_set, NODES_PER_CELL,
                                          cell_to_nodes, "cell_to_nodes_map");

   op_map cell_to_cells_map = op_decl_map(cells_set, cells_set, NEIGHBOUR_CELLS,
                                          cell_to_cells, "cell_to_cell_map");

Each cell belonging to ``cells_set`` is mapped to 4 nodes (``NODES_PER_CELL``) in ``nodes_set``. Hence, the map declaration ``cell_to_nodes_map`` has a dimension of 4: indices 0-3 of ``cell_to_nodes`` relate to the first cell (C1), mapping its connected nodes {N1, N2, N5, N6}; indices 4-7 relate to the second cell (C2), mapping its connected nodes {N2, N3, N7, N6}; and so on. As shown in ``cell_to_cells``, we use -1 as a mapping value to indicate that there is no element in that direction.

Moreover, since the mapping between particles and cells is dynamic (particles can be injected or removed, and they move between cells), we keep the cell index mapping per particle as data.

2.2.3 Defining data on sets
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Once the sets and their connectivity are defined, the mesh data can be associated with ``cells_set`` and ``nodes_set`` through the ``op_decl_dat`` API call. Note that in the example below, ``node_dat1`` is declared with dimension 2, allowing it to store {X, Y} coordinates, while ``cell_dat1`` stores a single double-precision value per set element.

.. code-block:: c++

   int DIM = 2;

   // cd1..cd6, x1..x12 and y1..y12 stand for the actual field values.
   double d_cell1[] = {cd1, cd2, cd3, cd4, cd5, cd6};
   double d_node1[] = {x1, y1, x2, y2, x3, y3, x4,  y4,  x5,  y5,  x6,  y6,
                       x7, y7, x8, y8, x9, y9, x10, y10, x11, y11, x12, y12};

   op_dat cell_dat1 = op_decl_dat(cells_set, 1, "double", sizeof(double),
                                  (char *)d_cell1, "cell field name");

   op_dat node_dat1 = op_decl_dat(nodes_set, DIM, "double", sizeof(double),
                                  (char *)d_node1, "node field name");

The particle dats should be created with ``op_decl_particle_dat``. Its arguments are similar to those of ``op_decl_dat``, except when defining the cell index dat: here an additional argument ``true`` should be provided, indicating that this dat holds the cell index used to map each particle to its containing cell.

.. code-block:: c++

   op_dat part_dat1 = op_decl_particle_dat(particles_set, 1, "double", sizeof(double),
                                           nullptr, "part field name");

   op_dat part_cell_index = op_decl_particle_dat(particles_set, 1, "int", sizeof(int),
                                                 nullptr, "part cell index", true);

The above will create empty particle dats, assuming that particles will be injected during the main loop. However, if the initial number of particles is known and the set was created with that size, the corresponding data can be provided as an array (instead of ``nullptr``) using the same API call.

.. code-block:: c++

   op_dat op_decl_particle_dat(op_set set, int dim, char const *type, int size, char *data,
                               char const *name, bool cell_index = false);

2.2.4 Operations over sets
^^^^^^^^^^^^^^^^^^^^^^^^^^

All of the numerically intensive operations in a PIC application can be described as computations over sets, accessing data through the mappings (if indirection exists).

**API:** ``op_par_loop``

.. code-block:: c++

   template <typename... T, typename... OPARG>
   void op_par_loop(void (*kernel)(T *...), char const *name, op_set set,
                    op_iterate_type iter_type, OPARG... arguments);

Consider the following sequential loop, which demonstrates most of the indirect mappings. It uses all of the structures and declarations defined in Sections 2.2.1, 2.2.2 and 2.2.3; however, it assumes that there are particles in ``particles_set``.

.. code-block:: c++

   void example_seq_loop(int nparticles, int *cell_to_node, int *cell_idx,
                         double *cell_dat, double *node_dat, double *part_dat)
   {
       for (int i = 0; i < nparticles; i++)
       {
           // Particle -> containing cell -> the four nodes of that cell
           int cell_index = cell_idx[i];

           int node0_mapping = cell_to_node[NODES_PER_CELL * cell_index + 0];
           int node1_mapping = cell_to_node[NODES_PER_CELL * cell_index + 1];
           int node2_mapping = cell_to_node[NODES_PER_CELL * cell_index + 2];
           int node3_mapping = cell_to_node[NODES_PER_CELL * cell_index + 3];

           double inc_value = (part_dat[i] + cell_dat[cell_index]);

           // Assume only the X value of the node data needs to be incremented
           node_dat[DIM * node0_mapping + 0] += inc_value;
           node_dat[DIM * node1_mapping + 0] += inc_value;
           node_dat[DIM * node2_mapping + 0] += inc_value;
           node_dat[DIM * node3_mapping + 0] += inc_value;

           part_dat[i] = 0.0;
       }
   }

The sequential example loop above iterates over all the particles, computes the sum of the particle dat and its corresponding cell dat, and increments all 4 connected node dats by that sum. Finally, the particle dat is assigned a new value (0.0). Here each particle maps to its containing cell through its cell index, and then to all four nodes connected to that cell, to compute the reduction operation (SUM).

Even though the sequential code looks simple, it becomes quite complex if the computation is to be done in parallel (OpenMP, MPI and/or GPUs), since race conditions need to be considered when executing increments to shared nodes. However, together with the API calls below and code-to-code translation, the proposed DSL hides these development complexities from the domain specialist and should provide optimised code to run on their intended platform.

.. code-block:: c++

   void example_kernel(double *part_data, const double *cell_data,
                       double *node0_data, double *node1_data,
                       double *node2_data, double *node3_data)
   {
       double inc_value = (*part_data + *cell_data);

       // Assume only the X value of the node data needs to be incremented
       node0_data[0] += inc_value;
       node1_data[0] += inc_value;
       node2_data[0] += inc_value;
       node3_data[0] += inc_value;

       *part_data = 0.0;
   }

   op_par_loop(example_kernel, "example_op_par_loop", particles_set, OP_ITERATE_ALL,
       op_arg_dat(part_dat1, OP_RW),
       op_arg_dat(cell_dat1, OP_READ, true),
       op_arg_dat(node_dat1, 0, cell_to_nodes_map, OP_INC, true),
       op_arg_dat(node_dat1, 1, cell_to_nodes_map, OP_INC, true),
       op_arg_dat(node_dat1, 2, cell_to_nodes_map, OP_INC, true),
       op_arg_dat(node_dat1, 3, cell_to_nodes_map, OP_INC, true)
   );

An application developer could write the elemental kernel function ``example_kernel`` and the ``op_par_loop`` declaration as above. Declaring the set together with ``OP_ITERATE_ALL`` makes the DSL iterate over all elements of the given set. In this case, the elemental kernel function takes 6 arguments, and the loop declaration requires the access method of each piece of data (e.g. ``OP_READ``, ``OP_WRITE``, ``OP_INC``). After the access specifier, a Boolean ``true`` value should be provided to every ``op_arg_dat`` that needs mapping through the particle cell index. In addition, the mapping offset (0, 1, 2, 3) and the ``op_map`` mapping should be provided to access the correct node connected to the cell. An argument with neither a Boolean ``true`` nor a mapping is accessed directly.

**API:** ``op_par_loop_particle``

.. code-block:: c++

   template <typename... T, typename... OPARG>
   void op_par_loop_particle(void (*kernel)(T *...), char const *name, op_set set,
                             op_iterate_type iter_type, OPARG... arguments);

Although most PIC equations can be written as ``op_par_loop`` API calls over ``particles_set``, ``cells_set`` or ``nodes_set``, particle movement (handling the change of cell index of a particle during inter-cell movement) has a different communication pattern. To cater for that requirement, a new API call, ``op_par_loop_particle``, is introduced.

As with ``op_par_loop``, the application developer should implement ``op_par_loop_particle`` declarations with similar constructs; however, it will only loop over a particle set created using ``op_decl_particle_set``. In addition, the elemental function should always take an ``int *move_status`` as its first argument, which the application developer should set to ``MOVE_DONE``, ``NEED_MOVE`` or ``NEED_REMOVE`` within the elemental function.

``NEED_REMOVE``
   The particle will be removed from the particle set.

``MOVE_DONE``
   The final cell index assigned in the kernel will be set on the particle, and the necessary communication of the particle will be handled by the DSL.

``NEED_MOVE``
   The same elemental kernel will be called again for the same particle, with the data corresponding to the new cell index set during the previous elemental function call.

An example of an elemental function required for ``op_par_loop_particle`` is given below.

.. code-block:: c++

   // Schematic only: the trailing arguments and the three conditions are
   // placeholders for application-specific data and logic.
   void example2_kernel(int *move_status, int *cell_index, double *...)
   {
       // Compute logic involving particle and mesh data

       if (is_inside_the_cell)
       {
           *move_status = MOVE_DONE;
       }
       else if (need_to_remove_from_mesh)
       {
           *move_status = NEED_REMOVE;
       }
       else // need_to_search_a_different_cell_in_the_mesh
       {
           *move_status = NEED_MOVE;
           (*cell_index)++; // or compute the most probable cell index to search next
       }
   }

**API:** ``op_increase_particle_count``

In order to add particles to the simulation, the particle count of the set must be increased, so the application developer should use the API call below.

.. code-block:: c++

   void op_increase_particle_count(op_set particles_set, int num_particles_to_insert);

Afterwards, both ``op_par_loop`` and ``op_par_loop_particle`` declarations can be used to iterate over only the new particles by changing the ``op_iterate_type`` to ``OP_ITERATE_INJECTED``.

**API:** ``op_particle_sort``

To gain better particle locality during kernel calls, and in applications where double indirection is present (e.g., particle→cell→node), particles need to be sorted according to their residing cell index (after particle injections and particle movements).

.. code-block:: c++

   void op_particle_sort(op_set set);

However, after calling ``op_particle_sort``, ``OP_ITERATE_INJECTED`` will not iterate over any particles at all, since the added particles are no longer considered new to the simulation.

2.3 OP-PIC: Unstructured Mesh Particle-in-Cell DSL Implementation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Although we do not have a complete unstructured mesh 3D electromagnetic FEM PIC code available to demonstrate the proposed PIC DSL functionality, we have converted three PIC codes to exhibit the use of the API calls and design, with unstructured-type indirect data mappings. They are:

- SimPIC, an electrostatic 1D FDTD structured mesh PIC code
- CabanaPIC, an electromagnetic 3D FDTD structured mesh PIC code
- FemPIC, an electrostatic 3D FEM unstructured mesh PIC code

For both SimPIC and CabanaPIC, the structured stencil-type computations were converted to unstructured-type indirect data mappings (which are loaded from a file prior to simulation). The new SimPIC and CabanaPIC codes are serial implementations written in C++ (without MPI), and the calculated particle data and grid point data have been verified to be equal to those of the original implementations.

FemPIC is a sequential electrostatic 3D unstructured mesh FEM PIC example code written in C++ as a part of the course https://www.particleincell.com/2015/fem-pic/, and contains an inject particles routine in addition to the usual PIC algorithm. The FemPIC code is unstructured in its original form. The new sequential and OpenMP versions are written in C++ and use PETSc (sparse matrix linear solvers) inside the PIC field solver instead of DSL API calls (the calculations need to be broken down into kernels to call the APIs, which will be the focus of future work).

The current implementations can be found at https://github.com/OP-DSL/OP-PIC. It should be noted that the current implementations include
sequential and OpenMP parallelisations, but do not include MPI parallelisation.

References
----------

[1] Benjamin Daniel Dudson, Peter Alec Hill, David Dickinson, Joseph Parker, Adam Dempsey, Andrew Allen, Arka Bokshi, Brendan Shanahan, Brett Friedman, Chenhao Ma, David Schwörer, Dmitry Meyerson, Eric Grinaker, George Breyiannia, Hasan Muhammed, Haruki Seto, Hong Zhang, Ilon Joseph, Jarrod Leddy, Jed Brown, Jens Madsen, John Omotani, Joshua Sauppe, Kevin Savage, Licheng Wang, Luke Easy, Marta Estarellas, Matt Thomas, Maxim Umansky, Michael Løiten, Minwoo Kim, M Leconte, Nicholas Walkden, Olivier Izacard, Pengwei Xi, Peter Naylor, Fabio Riva, Sanat Tiwari, Sean Farley, Simon Myers, Tianyang Xia, Tongnyeol Rhee, Xiang Liu, Xueqiao Xu, and Zhanhui Wang. BOUT++, October 2020.

[2] B D Dudson, M V Umansky, X Q Xu, P B Snyder, and H R Wilson. BOUT++: a framework for parallel plasma fluid simulations. arXiv, physics.plasm-ph:0810.5757, Nov 2008.

[3] C.D. Cantwell, D. Moxey, A. Comerford, A. Bolis, G. Rocco, G. Mengaldo, D. De Grazia, S. Yakovlev, J.-E. Lombard, D. Ekelschot, B. Jordi, H. Xu, Y. Mohamied, C. Eskilsson, B. Nelson, P. Vos, C. Biotto, R.M. Kirby, and S.J. Sherwin. Nektar++: An open-source spectral/hp element framework. Computer Physics Communications, 192:205–219, 2015.

[4] István Z. Reguly, Gihan R. Mudalige, Michael B. Giles, Dan Curran, and Simon McIntosh-Smith. The OPS Domain Specific Abstraction for Multi-block Structured Grid Computations. In Proceedings of the 2014 Fourth International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing, WOLFHPC '14, pages 58–67, Washington, DC, USA, 2014. IEEE Computer Society.

[5] G. R. Mudalige, M. B. Giles, I. Reguly, C. Bertolli, and P. H. J. Kelly. OP2: An active library framework for solving unstructured mesh-based applications on multi-core and many-core architectures. In 2012 Innovative Parallel Computing (InPar), pages 1–12, May 2012.

[6] István Z Reguly, Gihan R Mudalige, Michael B Giles, Dan Curran, and Simon McIntosh-Smith. The OPS domain specific abstraction for multi-block structured grid computations. In 2014 Fourth International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing, pages 58–67. IEEE, 2014.

[7] Florian Rathgeber, David A. Ham, Lawrence Mitchell, Michael Lange, Fabio Luporini, Andrew T. T. McRae, Gheorghe-Teodor Bercea, Graham R. Markall, and Paul H. J. Kelly. Firedrake: Automating the Finite Element Method by Composing Abstractions. ACM Trans. Math. Softw., 43(3):24:1–24:27, December 2016.

[8] C. K. Birdsall and A. B. Langdon. Plasma Physics via Computer Simulation. Plasma Physics Series. Institute of Physics Publishing, Bristol BS1 6BE, UK, 1991.

[9] John M. Dawson. Particle Simulation of Plasmas. Reviews of Modern Physics, 55:403–447, Apr 1983.

[10] Gihan R Mudalige, Mike B Giles, I Reguly, Carlo Bertolli, and Paul HJ Kelly. OP2: An active library framework for solving unstructured mesh-based applications on multi-core and many-core architectures. In 2012 Innovative Parallel Computing (InPar), pages 1–12. IEEE, 2012.

[11] Matthias Christen, Olaf Schenk, and Helmar Burkhart. PATUS: A code generation and autotuning framework for parallel iterative stencil computations on modern microarchitectures. In 2011 IEEE International Parallel & Distributed Processing Symposium, pages 676–687. IEEE, 2011.

[12] PSyclone Project, 2018. http://psyclone.readthedocs.io/.

[13] Florian Rathgeber, Graham R Markall, Lawrence Mitchell, Nicolas Loriant, David A Ham, Carlo Bertolli, and Paul HJ Kelly. PyOP2: A high-level framework for performance-portable simulations on unstructured meshes. In 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, pages 1116–1123. IEEE, 2012.

[14] G.D. Balogh, G.R. Mudalige, I.Z. Reguly, S.F. Antao, and C. Bertolli. OP2-Clang: A source-to-source translator using Clang/LLVM LibTooling. In 2018 IEEE/ACM 5th Workshop on the LLVM Compiler
Infrastructure in HPC (LLVM-HPC), pages 59–70, 2018.

Corrections/Clarifications
--------------------------

The DSL that we currently propose is a loop-level abstraction, specialising code generation for the hardware. A higher-level abstraction, allowing developers to specify problems in terms of mathematical equations, can be built on top of this. Similar high-level abstractions have been developed with OpenSBLI (https://opensbli.github.io/), which generates the OPS loop-level DSL [6], and Firedrake (https://www.firedrakeproject.org), which generates PyOP2 [13]. Such a higher-level abstraction could be developed targeting the OP-PIC DSL as the backend; however, we have not focused on this at this time. We will need a better understanding of how to specify the problem at a mathematical level for this. [We do not have expertise in these equations, so we will need to work with Physicists/UKAEA scientists]

- Is an ``op_set`` decomposed over MPI ranks?

  The ``op_set`` objects exist on each rank, and they are just a symbol denoting the collection of dats and maps (particles/nodes/cells etc.) on that rank.

- "we will keep cell index mapping per particle as data" clarification.

  In OP2, all the mappings (from one ``op_set`` to another ``op_set``) are placed inside ``op_map`` objects, which are static. Even though the cell index is a mapping from a particle set to its underlying cell set, the dynamic nature of particles leads us to store the cell index in the particles as data, while treating it differently from other data.

- Cell data - looks like CellDat in NESO-Particles.

  To our knowledge, yes. However, there could be differences of which we are not aware.

- Section 2.2.3 particle data - These look like they have a one-to-one mapping with PPMD/NESO-Particles ParticleDat objects.

  There could be differences of which we are not aware. ``op_dat`` objects have the capability of arranging the data in the dat as AoS/SoA depending on the hardware architecture.

- Is "elemental function" another name for kernel?

  Yes, we could call it an elemental kernel.

- ``op_increase_particle_count``: I think there will be a use case for an API where particles can be initialised (e.g. from a particular distribution) then injected.

  As explained in Section 2.2.1, a particle set can be defined by providing the initial particle count (the size of the set). This allows the user to initialise particle values for dats prior to the main loop. As explained in Section 2.2.4, initialisation of injected particles during the main loop can be done using ``op_par_loop`` and ``op_par_loop_particle`` declarations with the iterate type set to ``OP_ITERATE_INJECTED`` (here we could copy data from another dat to the particle data if required).

- ``op_particle_sort``: If I sort the particles then call a parloop that accesses a grid data set (INC or READ), will the code generation exploit this, e.g. combine writes, remove indirections?

  Since data on sets are kept per MPI rank, the particle sorting will be done per MPI rank (the particles will belong to the cells of the current MPI rank). The OP-PIC DSL is capable of generating code without data hazards and of making halo exchanges when necessary.

- Please can you give an overview of (or point to) how this DSL (embedded in C++?) is constructed and how code generation occurs (maybe for cases where there is an existing parallel loop implementation)?

  - OP2-Clang: A Source-to-Source Translator Using Clang/LLVM LibTooling [14]
  - https://warwick.ac.uk/fac/sci/dcs/people/gihan_mudalige/talksandpresentations/keynote-europardslaug2022.pdf (slides 7 and 8)
  - https://warwick.ac.uk/fac/sci/dcs/people/gihan_mudalige/talksandpresentations/dslworkshoptalk_oct2020.pdf

:pdfembed:`src:_static/TN-04-2_ProposalParticleDsl.pdf, height:1600, width:1100, align:middle`
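
The move-status protocol of Section 2.2.4 (``MOVE_DONE``, ``NEED_MOVE``, ``NEED_REMOVE``) can be illustrated with a small sequential sketch. This is not the OP-PIC implementation or its generated code. The 1D mesh of unit-width cells, the kernel ``move_kernel_1d``, the driver ``drive_move_loop`` and the numeric values of the status codes are all assumptions made for illustration. It shows only the control flow a runtime might follow: call the elemental kernel again while it reports ``NEED_MOVE``, keep the particle on ``MOVE_DONE``, and drop it on ``NEED_REMOVE``.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Status names follow Section 2.2.4; the numeric values are assumed.
enum MoveStatus { MOVE_DONE = 0, NEED_MOVE = 1, NEED_REMOVE = 2 };

// Hypothetical elemental kernel on a 1D mesh of n_cells unit-width cells,
// where cell c covers the interval [c, c + 1).
void move_kernel_1d(int *move_status, int *cell_index,
                    const double *x, const int *n_cells)
{
    if (*x < 0.0 || *x >= static_cast<double>(*n_cells)) {
        *move_status = NEED_REMOVE;               // particle has left the mesh
    } else if (*x >= *cell_index && *x < *cell_index + 1) {
        *move_status = MOVE_DONE;                 // particle is inside this cell
    } else {
        *move_status = NEED_MOVE;                 // search the neighbour nearer to x
        *cell_index += (*x < *cell_index) ? -1 : 1;
    }
}

// Hypothetical sequential driver. It re-invokes the kernel for a particle
// while the kernel reports NEED_MOVE, stores the final cell index on
// MOVE_DONE, and compacts the arrays to drop NEED_REMOVE particles.
// Returns the number of particles removed.
std::size_t drive_move_loop(std::vector<double> &x, std::vector<int> &cell,
                            int n_cells)
{
    assert(x.size() == cell.size());
    std::size_t kept = 0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        int status = NEED_MOVE;
        int c = cell[i];
        while (status == NEED_MOVE) {
            move_kernel_1d(&status, &c, &x[i], &n_cells);
        }
        if (status == MOVE_DONE) {
            x[kept] = x[i];
            cell[kept] = c;
            ++kept;
        }
    }
    const std::size_t removed = x.size() - kept;
    x.resize(kept);
    cell.resize(kept);
    return removed;
}
```

With four particles at positions {0.5, 2.7, -1.0, 3.2} that start in cells {0, 0, 1, 3} on a 4-cell mesh, the driver walks the second particle from cell 0 to cell 2 and removes the third. A parallel version would also have to handle particles crossing MPI rank boundaries, which this sketch does not attempt.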