* Info: Selecting the 'perf-low-ppn' engine for node ip-172-31-42-13
* Info: "ref-cycles" not supported on ip-172-31-42-13: fallback to "cpu-clock"
* Info: Process launched (host ip-172-31-42-13, process 5890)
miniqmc git branch: OMP_offload
miniqmc git commit: 34c39aa17b79f2e7e5c41ff1896cb0847b88715a
number of ranks : 1, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
OpenMP threads = 64
Number of walkers per rank = 64
SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow,
determinant update, and distance table + einspline of the
reference implementation
==================================
Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer Inclusive_time Exclusive_time Calls Time_per_call
Setup 0.0718 0.0718 1 0.071813036
ParticleSet:::update 0.0000 0.0000 1 0.000002705
Total 151.4851 0.1283 1 151.485116126
Diffusion 96.9721 0.0710 5 19.394417403
Complete Updates 1.3017 0.0000 5 0.260332629
DeterminantRef::update 1.3016 1.3016 10 0.130161583
Current Gradient 5.5164 0.0692 30720 0.000179572
DeterminantRef::ratio 5.4074 5.4074 30720 0.000176023
OneBodyJastrowRef 0.0220 0.0220 30720 0.000000715
TwoBodyJastrowRef 0.0179 0.0179 30720 0.000000582
Kinetic Energy 0.9528 0.9519 5 0.190563111
OneBodyJastrowRef 0.0005 0.0005 5 0.000106405
TwoBodyJastrowRef 0.0004 0.0004 5 0.000082681
New Gradient 15.6776 0.0644 30720 0.000510340
DeterminantRef::ratio 0.1787 0.1787 30720 0.000005816
DeterminantRef::spovgl 14.1940 0.2616 30720 0.000462045
Single-Particle Orbitals 13.9325 13.9325 30720 0.000453531
OneBodyJastrowRef 0.1684 0.1684 30720 0.000005480
TwoBodyJastrowRef 1.0722 1.0722 30720 0.000034903
ParticleSet:::acceptMove 12.6138 0.0364 15371 0.000820621
DTAAOMPTarget::update_e_e 12.5071 12.5071 15371 0.000813681
DTABOMPTarget::update_ion_e 0.0702 0.0702 15371 0.000004569
ParticleSet:::computeNewPosDT 2.0128 0.0415 30720 0.000065522
DTAAOMPTarget::move_e_e 1.7757 1.7757 30720 0.000057804
DTABOMPTarget::move_ion_e 0.1956 0.1956 30720 0.000006368
ParticleSet:::donePbyP 0.0000 0.0000 5 0.000001092
Update 58.8259 0.0261 15371 0.003827070
DeterminantRef::update 57.2317 57.2317 15371 0.003723353
OneBodyJastrowRef 0.0062 0.0062 15371 0.000000402
TwoBodyJastrowRef 1.5620 1.5620 15371 0.000101619
Initialization 9.4000 4.4335 1 9.400037529
DeterminantRef::inverse 1.9489 1.9489 2 0.974473267
DeterminantRef::spovgl 2.6771 0.0418 2 1.338530714
Single-Particle Orbitals 2.6353 2.6353 6144 0.000428916
OneBodyJastrowRef 0.0051 0.0051 1 0.005122783
ParticleSet:::update 0.2504 0.0736 2 0.125180787
DTAAOMPTarget::evaluate_e_e 0.1436 0.1436 1 0.143628398
DTABOMPTarget::evaluate_ion_e 0.0331 0.0002 1 0.033089891
DTABOMPTarget::offload_ion_e 0.0329 0.0329 1 0.032912600
TwoBodyJastrowRef 0.0850 0.0850 1 0.085029144
Pseudopotential 44.9847 0.2179 5 8.996933562
DeterminantRef::spoval 34.9860 0.7828 10215 0.003424959
Single-Particle Orbitals 34.2032 34.2032 122580 0.000279027
OneBodyJastrowRef 0.1117 0.1117 10215 0.000010931
ParticleSet:::update 7.4980 0.0380 10215 0.000734023
DTABOMPTarget::evaluate_e_virtual 6.7207 0.0135 10215 0.000657925
DTABOMPTarget::offload_e_virtual 6.7072 6.7072 10215 0.000656603
DTABOMPTarget::evaluate_ion_virtual 0.7393 0.0117 10215 0.000072374
DTABOMPTarget::offload_ion_virtual 0.7276 0.7276 10215 0.000071227
TwoBodyJastrowRef 2.1711 2.1711 10215 0.000212540
========== Throughput ============
Total throughput ( N_walkers * N_elec^3 / Total time ) = 9.79859e+10
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 1.53069e+11
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 5.37054e+07
* Info: Process finished (host ip-172-31-42-13, process 5890)
* Info: Dumping samples (host ip-172-31-42-13, process 5890)
* Info: Dumping source info for callchain nodes (host ip-172-31-42-13, process 5890)
* Info: Building/writing metadata (host ip-172-31-42-13)
* Info: Finished collect step (host ip-172-31-42-13, process 5890)
Your experiment path is /home/kcamus/runs_miniqmc/miniqmc_gcc_o52/tools/lprof_npsu_run_0
To display your profiling results:
############################################################################################################################
# LEVEL | REPORT | COMMAND #
############################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/kcamus/runs_miniqmc/miniqmc_gcc_o52/tools/lprof_npsu_run_0 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/kcamus/runs_miniqmc/miniqmc_gcc_o52/tools/lprof_npsu_run_0 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/kcamus/runs_miniqmc/miniqmc_gcc_o52/tools/lprof_npsu_run_0 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/kcamus/runs_miniqmc/miniqmc_gcc_o52/tools/lprof_npsu_run_0 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/kcamus/runs_miniqmc/miniqmc_gcc_o52/tools/lprof_npsu_run_0 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/kcamus/runs_miniqmc/miniqmc_gcc_o52/tools/lprof_npsu_run_0 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/kcamus/runs_miniqmc/miniqmc_gcc_o52/tools/lprof_npsu_run_0 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/kcamus/runs_miniqmc/miniqmc_gcc_o52/tools/lprof_npsu_run_0 #
############################################################################################################################