
Executable Output

miniqmc git branch: OMP_offload
miniqmc git commit: 34c39aa17b79f2e7e5c41ff1896cb0847b88715a

number of ranks : 1, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
OpenMP threads = 52
Number of walkers per rank = 52

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, determinant update, and distance table + einspline of the reference implementation

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.0510     0.0510              1       0.051035645
  ParticleSet:::update                         0.0000     0.0000              1       0.000007983
Total                                        189.0101     3.5041              1     189.010072927
  Diffusion                                  103.1838     0.1842              5      20.636763694
    Complete Updates                           0.7786     0.0001              5       0.155725454
      DeterminantRef::update                   0.7786     0.7786             10       0.077857620
    Current Gradient                           5.9472     0.0979          30720       0.000193593
      DeterminantRef::ratio                    5.8071     5.8071          30720       0.000189035
      OneBodyJastrowRef                        0.0227     0.0227          30720       0.000000738
      TwoBodyJastrowRef                        0.0195     0.0195          30720       0.000000634
    Kinetic Energy                             1.0035     1.0025              5       0.200690110
      OneBodyJastrowRef                        0.0006     0.0006              5       0.000126852
      TwoBodyJastrowRef                        0.0004     0.0004              5       0.000072756
    New Gradient                              28.4427     0.1244          30720       0.000925870
      DeterminantRef::ratio                    0.4435     0.4435          30720       0.000014435
      DeterminantRef::spovgl                  25.1043     1.2195          30720       0.000817197
        Single-Particle Orbitals              23.8848    23.8848          30720       0.000777499
      OneBodyJastrowRef                        0.3369     0.3369          30720       0.000010967
      TwoBodyJastrowRef                        2.4337     2.4337          30720       0.000079221
    ParticleSet:::acceptMove                  14.4576     0.0474          15371       0.000940573
      DTAAOMPTarget::update_e_e               14.2770    14.2770          15371       0.000928830
      DTABOMPTarget::update_ion_e              0.1331     0.1331          15371       0.000008659
    ParticleSet:::computeNewPosDT              3.2083     0.0553          30720       0.000104437
      DTAAOMPTarget::move_e_e                  2.9111     2.9111          30720       0.000094762
      DTABOMPTarget::move_ion_e                0.2419     0.2419          30720       0.000007874
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000003093
    Update                                    49.1618     0.0389          15371       0.003198346
      DeterminantRef::update                  46.5697    46.5697          15371       0.003029714
      OneBodyJastrowRef                        0.0089     0.0089          15371       0.000000576
      TwoBodyJastrowRef                        2.5443     2.5443          15371       0.000165523
  Initialization                               9.2946     2.0994              1       9.294563904
    DeterminantRef::inverse                    2.4640     2.4640              2       1.232009293
    DeterminantRef::spovgl                     4.1346     0.1932              2       2.067278149
      Single-Particle Orbitals                 3.9413     3.9413           6144       0.000641491
    OneBodyJastrowRef                          0.0205     0.0205              1       0.020478534
    ParticleSet:::update                       0.3890     0.0710              2       0.194477142
      DTAAOMPTarget::evaluate_e_e              0.3029     0.3029              1       0.302939119
      DTABOMPTarget::evaluate_ion_e            0.0150     0.0003              1       0.014981242
        DTABOMPTarget::offload_ion_e           0.0147     0.0147              1       0.014722986
    TwoBodyJastrowRef                          0.1872     0.1872              1       0.187200390
  Pseudopotential                             73.0276     0.1484              5      14.605520151
    DeterminantRef::spoval                    61.3432     2.4833          10215       0.006005211
      Single-Particle Orbitals                58.8599    58.8599         122580       0.000480175
    OneBodyJastrowRef                          0.0820     0.0820          10215       0.000008031
    ParticleSet:::update                       9.2287     0.0296          10215       0.000903442
      DTABOMPTarget::evaluate_e_virtual        8.4282     0.0143          10215       0.000825084
        DTABOMPTarget::offload_e_virtual       8.4140     8.4140          10215       0.000823686
      DTABOMPTarget::evaluate_ion_virtual      0.7708     0.0139          10215       0.000075456
        DTABOMPTarget::offload_ion_virtual     0.7569     0.7569          10215       0.000074098
    TwoBodyJastrowRef                          2.2252     2.2252          10215       0.000217841
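
The profile above can be sanity-checked by hand. The sketch below assumes the usual stack-timer convention, which this log does not state explicitly: a timer's inclusive time is its own exclusive time plus the inclusive times of its direct children. The numbers are copied from the `Total` row and its three children; the variable names are illustrative only.

```python
# Values copied from the stack timer profile (seconds).
total_inclusive = 189.0101
total_exclusive = 3.5041

# Direct children of "Total" and their inclusive times.
children_inclusive = {
    "Diffusion": 103.1838,
    "Initialization": 9.2946,
    "Pseudopotential": 73.0276,
}

# Assumed convention: inclusive = exclusive + sum(children inclusive).
reconstructed = total_exclusive + sum(children_inclusive.values())
residual = reconstructed - total_inclusive

# Share of the total run time spent in each top-level phase.
shares = {name: t / total_inclusive for name, t in children_inclusive.items()}

print(f"reconstructed Total = {reconstructed:.4f} s (residual {residual:+.4f} s)")
for name, share in shares.items():
    print(f"{name:<16} {share:6.1%}")
```

Under that assumption the residual stays within the rounding of the four-decimal columns, and the shares show that Diffusion and Pseudopotential together account for most of the run.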

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 6.38075e+10
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 1.16881e+11
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 2.68793e+07
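
The throughput figures follow directly from the formulas printed in parentheses. The sketch below recomputes them from values reported earlier in this log (walkers per rank, electron count, and the inclusive times of `Total`, `Diffusion`, and `Pseudopotential`). The helper `throughput` is a hypothetical name for illustration, not part of miniqmc.

```python
# Inputs copied from the run header and the stack timer profile.
n_walkers = 52                 # "Number of walkers per rank"
n_elec = 6144                  # "Number of electrons"
total_time = 189.0101          # inclusive time of "Total" (s)
diffusion_time = 103.1838      # inclusive time of "Diffusion" (s)
pseudopotential_time = 73.0276 # inclusive time of "Pseudopotential" (s)


def throughput(walkers: int, electrons: int, power: int, seconds: float) -> float:
    """Return walkers * electrons**power / seconds, as in the log's formulas."""
    return walkers * electrons**power / seconds


total_tp = throughput(n_walkers, n_elec, 3, total_time)
diffusion_tp = throughput(n_walkers, n_elec, 3, diffusion_time)
pseudopotential_tp = throughput(n_walkers, n_elec, 2, pseudopotential_time)

print(f"Total throughput           = {total_tp:.5e}")
print(f"Diffusion throughput       = {diffusion_tp:.5e}")
print(f"Pseudopotential throughput = {pseudopotential_tp:.5e}")
```

Because the times used here are the rounded four-decimal values from the profile, the results should match the reported figures only up to that rounding. Note that the pseudopotential metric scales with N_elec^2 while the other two scale with N_elec^3, so the three numbers are not directly comparable.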

* Info: Dumping samples (host skylake, process 2109675)
* Info: Dumping source info for callchain nodes (host skylake, process 2109675)
* Info: Building/writing metadata (host skylake)
* Info: Finished collect step (host skylake, process 2109675)

Your experiment path is /home/kcamus/miniapp_intel/miniqmc/runs/miniqmc_422_zmmhigh_o52_prompt/tools/lprof_npsu_run_0

To display your profiling results:
#    LEVEL    |     REPORT     |                                                        COMMAND                                                         #
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/miniapp_intel/miniqmc/runs/miniqmc_422_zmmhigh_o52_prompt/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/miniapp_intel/miniqmc/runs/miniqmc_422_zmmhigh_o52_prompt/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/miniapp_intel/miniqmc/runs/miniqmc_422_zmmhigh_o52_prompt/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/miniapp_intel/miniqmc/runs/miniqmc_422_zmmhigh_o52_prompt/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/miniapp_intel/miniqmc/runs/miniqmc_422_zmmhigh_o52_prompt/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/miniapp_intel/miniqmc/runs/miniqmc_422_zmmhigh_o52_prompt/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/miniapp_intel/miniqmc/runs/miniqmc_422_zmmhigh_o52_prompt/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/miniapp_intel/miniqmc/runs/miniqmc_422_zmmhigh_o52_prompt/tools/lprof_npsu_run_0  #
