* Info: Selecting the 'perf-low-ppn' engine for node skylake
* Info: Process launched (host skylake, process 662029)
miniqmc git branch: OMP_offload
miniqmc git commit: de45b04eb021c4b57ba6f4bee8f563c614d11135
number of ranks : 1, number of accelerators : 0
Number of orbitals/splines = 1536
Tile size = 1536
Number of tiles = 1
Number of electrons = 3072
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
OpenMP threads = 16
Number of walkers per rank = 16
SPO coefficients size = 786432000 bytes (750 MB)
delayed update rank = 32
Using SoA distance table, Jastrow + einspline,
and determinant update.
==================================
Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer Inclusive_time Exclusive_time Calls Time_per_call
Setup 0.0351 0.0351 1 0.035084904
  ParticleSet:::update 0.0000 0.0000 1 0.000000994
Total 35.3024 0.2066 1 35.302359362
  Diffusion 14.1569 0.0259 5 2.831380165
    Complete Updates 0.0680 0.0000 5 0.013590227
      Determinant::update 0.0679 0.0679 10 0.006793341
    Current Gradient 0.4231 0.0145 15360 0.000027543
      Determinant::ratio 0.4017 0.4017 15360 0.000026152
      OneBodyJastrow 0.0040 0.0040 15360 0.000000259
      TwoBodyJastrow 0.0028 0.0028 15360 0.000000184
    Kinetic Energy 0.1030 0.1028 5 0.020609837
      OneBodyJastrow 0.0001 0.0001 5 0.000025152
      TwoBodyJastrow 0.0001 0.0001 5 0.000025216
    New Gradient 9.2381 0.0168 15360 0.000601441
      Determinant::ratio 0.0610 0.0610 15360 0.000003968
      Determinant::spovgl 8.9471 0.1437 15360 0.000582496
        Single-Particle Orbitals 8.8034 8.8034 15360 0.000573140
      OneBodyJastrow 0.0253 0.0253 15360 0.000001648
      TwoBodyJastrow 0.1879 0.1879 15360 0.000012235
    ParticleSet:::acceptMove 1.5092 0.0065 7611 0.000198289
      DTAAOMPTarget::update_e_e 1.4890 1.4890 7611 0.000195640
      DTABOMPTarget::update_ion_e 0.0136 0.0136 7611 0.000001791
    ParticleSet:::computeNewPosDT 0.2541 0.0083 15360 0.000016541
      DTAAOMPTarget::move_e_e 0.2096 0.2096 15360 0.000013643
      DTABOMPTarget::move_ion_e 0.0362 0.0362 15360 0.000002355
    ParticleSet:::donePbyP 0.0000 0.0000 5 0.000001517
    Update 2.5355 0.0079 7611 0.000333137
      Determinant::update 2.2910 2.2910 7611 0.000301014
      OneBodyJastrow 0.0015 0.0015 7611 0.000000196
      TwoBodyJastrow 0.2351 0.2351 7611 0.000030887
  Initialization 2.5955 0.4614 1 2.595456712
    Determinant::inverse 0.2314 0.2314 2 0.115721759
    Determinant::spovgl 1.8212 0.0406 2 0.910615461
      Single-Particle Orbitals 1.7807 1.7807 3072 0.000579645
    OneBodyJastrow 0.0034 0.0034 1 0.003381389
    ParticleSet:::update 0.0451 0.0226 2 0.022533880
      DTAAOMPTarget::evaluate_e_e 0.0173 0.0173 1 0.017285574
      DTABOMPTarget::evaluate_ion_e 0.0052 0.0008 1 0.005160034
        DTABOMPTarget::offload_ion_e 0.0044 0.0044 1 0.004362124
    TwoBodyJastrow 0.0330 0.0330 1 0.032964125
  Pseudopotential 18.3434 0.0285 5 3.668677637
    Determinant::spoval 17.2988 0.0214 5359 0.003227982
      Single-Particle Orbitals 17.2774 17.2774 5359 0.003223991
    OneBodyJastrow 0.0223 0.0223 5359 0.000004154
    ParticleSet:::update 0.7406 0.0086 5359 0.000138204
      DTABOMPTarget::evaluate_e_virtual 0.6363 0.0035 5359 0.000118727
        DTABOMPTarget::offload_e_virtual 0.6327 0.6327 5359 0.000118068
      DTABOMPTarget::evaluate_ion_virtual 0.0958 0.0033 5359 0.000017880
        DTABOMPTarget::offload_ion_virtual 0.0925 0.0925 5359 0.000017266
    TwoBodyJastrow 0.2532 0.2532 5359 0.000047253
========== Throughput ============
Total throughput ( N_walkers * N_elec^3 / Total time ) = 1.31395e+10
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 3.27654e+10
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 8.23157e+06
* Info: Process finished (host skylake, process 662029)
* Info: Dumping samples (host skylake, process 662029)
* Info: Dumping source info for callchain nodes (host skylake, process 662029)
* Info: Building/writing metadata (host skylake)
* Info: Finished collect step (host skylake, process 662029)
Your experiment path is /home/eoseret/miniqmc/build_icx/bin/OV3_miniqmc_icx_zmm_16T/tools/lprof_npsu_run_0
To display your profiling results:
##############################################################################################################################################
# LEVEL | REPORT | COMMAND #
##############################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/miniqmc/build_icx/bin/OV3_miniqmc_icx_zmm_16T/tools/lprof_npsu_run_0 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/miniqmc/build_icx/bin/OV3_miniqmc_icx_zmm_16T/tools/lprof_npsu_run_0 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/miniqmc/build_icx/bin/OV3_miniqmc_icx_zmm_16T/tools/lprof_npsu_run_0 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/miniqmc/build_icx/bin/OV3_miniqmc_icx_zmm_16T/tools/lprof_npsu_run_0 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/miniqmc/build_icx/bin/OV3_miniqmc_icx_zmm_16T/tools/lprof_npsu_run_0 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/miniqmc/build_icx/bin/OV3_miniqmc_icx_zmm_16T/tools/lprof_npsu_run_0 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/miniqmc/build_icx/bin/OV3_miniqmc_icx_zmm_16T/tools/lprof_npsu_run_0 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/miniqmc/build_icx/bin/OV3_miniqmc_icx_zmm_16T/tools/lprof_npsu_run_0 #
##############################################################################################################################################