Executable Output

miniqmc git branch: OMP_offload
miniqmc git commit: 34c39aa17b79f2e7e5c41ff1896cb0847b88715a

number of ranks : 1, number of accelerators : 0
Number of orbitals/splines = 3072
Tile size = 3072
Number of tiles = 1
Number of electrons = 6144
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
OpenMP threads = 52
Number of walkers per rank = 52

SPO coefficients size = 1572864000 bytes (1500 MB)
delayed update rank = 32
Using the reference implementation for Jastrow, 
determinant update, and distance table + einspline of the 
reference implementation 
================================== 

Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer                                       Inclusive_time  Exclusive_time  Calls       Time_per_call
Setup                                          0.0504     0.0504              1       0.050403909
  ParticleSet:::update                         0.0000     0.0000              1       0.000007803
Total                                        198.2891    17.7271              1     198.289072703
  Diffusion                                   99.1012     0.1702              5      19.820239544
    Complete Updates                           0.6366     0.0001              5       0.127316558
      DeterminantRef::update                   0.6365     0.6365             10       0.063651006
    Current Gradient                           5.8368     0.1103          30720       0.000189998
      DeterminantRef::ratio                    5.6780     5.6780          30720       0.000184831
      OneBodyJastrowRef                        0.0274     0.0274          30720       0.000000892
      TwoBodyJastrowRef                        0.0210     0.0210          30720       0.000000684
    Kinetic Energy                             0.9840     0.9828              5       0.196799993
      OneBodyJastrowRef                        0.0008     0.0008              5       0.000150409
      TwoBodyJastrowRef                        0.0004     0.0004              5       0.000082190
    New Gradient                              29.9644     0.1311          30720       0.000975405
      DeterminantRef::ratio                    0.4485     0.4485          30720       0.000014600
      DeterminantRef::spovgl                  26.0315     1.3601          30720       0.000847379
        Single-Particle Orbitals              24.6714    24.6714          30720       0.000803106
      OneBodyJastrowRef                        0.3838     0.3838          30720       0.000012492
      TwoBodyJastrowRef                        2.9696     2.9696          30720       0.000096667
    ParticleSet:::acceptMove                  13.1821     0.0479          15371       0.000857597
      DTAAOMPTarget::update_e_e               12.9881    12.9881          15371       0.000844972
      DTABOMPTarget::update_ion_e              0.1462     0.1462          15371       0.000009509
    ParticleSet:::computeNewPosDT              3.0931     0.0545          30720       0.000100686
      DTAAOMPTarget::move_e_e                  2.7898     2.7898          30720       0.000090814
      DTABOMPTarget::move_ion_e                0.2487     0.2487          30720       0.000008096
    ParticleSet:::donePbyP                     0.0000     0.0000              5       0.000002918
    Update                                    45.2340     0.0412          15371       0.002942816
      DeterminantRef::update                  41.8797    41.8797          15371       0.002724589
      OneBodyJastrowRef                        0.0086     0.0086          15371       0.000000557
      TwoBodyJastrowRef                        3.3047     3.3047          15371       0.000214993
  Initialization                              11.7751     4.6636              1      11.775113315
    DeterminantRef::inverse                    2.1724     2.1724              2       1.086224912
    DeterminantRef::spovgl                     4.3202     0.1989              2       2.160097934
      Single-Particle Orbitals                 4.1213     4.1213           6144       0.000670790
    OneBodyJastrowRef                          0.0223     0.0223              1       0.022333579
    ParticleSet:::update                       0.4312     0.1113              2       0.215604593
      DTAAOMPTarget::evaluate_e_e              0.2997     0.2997              1       0.299728475
      DTABOMPTarget::evaluate_ion_e            0.0202     0.0004              1       0.020208187
        DTABOMPTarget::offload_ion_e           0.0198     0.0198              1       0.019763373
    TwoBodyJastrowRef                          0.1653     0.1653              1       0.165301417
  Pseudopotential                             69.6857     0.1675              5      13.937133733
    DeterminantRef::spoval                    58.8402     2.3166          10215       0.005760173
      Single-Particle Orbitals                56.5236    56.5236         122580       0.000461116
    OneBodyJastrowRef                          0.0931     0.0931          10215       0.000009115
    ParticleSet:::update                       8.4194     0.0327          10215       0.000824223
      DTABOMPTarget::evaluate_e_virtual        7.6853     0.0135          10215       0.000752358
        DTABOMPTarget::offload_e_virtual       7.6719     7.6719          10215       0.000751040
      DTABOMPTarget::evaluate_ion_virtual      0.7014     0.0136          10215       0.000068663
        DTABOMPTarget::offload_ion_virtual     0.6878     0.6878          10215       0.000067328
    TwoBodyJastrowRef                          2.1654     2.1654          10215       0.000211984
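
In the profile above, Exclusive_time appears to be a region's Inclusive_time minus the inclusive times of its direct children. The following is a minimal Python sketch that checks this reading against four rows copied from the table. The dictionary layout and the helper name exclusive_time are illustrative assumptions and are not part of miniqmc.

```python
# Rows copied from the stack timer profile: inclusive time, reported exclusive
# time, and the inclusive times of each region's direct children (seconds).
profile = {
    "Total": (198.2891, 17.7271, [99.1012, 11.7751, 69.6857]),
    "Diffusion": (99.1012, 0.1702,
                  [0.6366, 5.8368, 0.9840, 29.9644, 13.1821, 3.0931, 0.0000, 45.2340]),
    "Initialization": (11.7751, 4.6636, [2.1724, 4.3202, 0.0223, 0.4312, 0.1653]),
    "Pseudopotential": (69.6857, 0.1675, [58.8402, 0.0931, 8.4194, 2.1654]),
}


def exclusive_time(inclusive, children):
    """Hypothetical helper: time spent in a region outside its timed children."""
    return inclusive - sum(children)


computed = {name: exclusive_time(inc, kids) for name, (inc, _, kids) in profile.items()}

# Print computed and reported values side by side; small differences are
# expected because the table rounds every entry to four decimals.
for name, (_, reported, _) in profile.items():
    print(f"{name:16s} computed={computed[name]:9.4f} reported={reported:9.4f}")
```

Under this reading, the 17.7271 s of exclusive time on the Total row is time not attributed to Diffusion, Initialization, or Pseudopotential.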

========== Throughput ============ 

Total throughput ( N_walkers * N_elec^3 / Total time ) = 6.08216e+10
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 1.21696e+11
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 2.81684e+07
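
The three throughput figures can be reproduced from the formulas printed in their labels. The sketch below is a minimal Python check that uses the walker count, electron count, and inclusive times reported earlier in this output. The function name throughput is an illustrative assumption, and the times are the four-decimal values from the timer table, so the results agree with the log only to about five significant digits.

```python
# Values copied from the run summary and the stack timer profile.
n_walkers = 52            # "Number of walkers per rank"
n_elec = 6144             # "Number of electrons"
total_time = 198.2891     # inclusive time of "Total" (seconds)
diffusion_time = 99.1012  # inclusive time of "Diffusion" (seconds)
pseudo_time = 69.6857     # inclusive time of "Pseudopotential" (seconds)


def throughput(walkers, electrons, power, seconds):
    """Hypothetical helper: N_walkers * N_elec^power / time."""
    return walkers * electrons ** power / seconds


total_tp = throughput(n_walkers, n_elec, 3, total_time)
diffusion_tp = throughput(n_walkers, n_elec, 3, diffusion_time)
pseudo_tp = throughput(n_walkers, n_elec, 2, pseudo_time)

print(f"Total throughput           = {total_tp:.4e}")
print(f"Diffusion throughput       = {diffusion_tp:.4e}")
print(f"Pseudopotential throughput = {pseudo_tp:.4e}")
```

Note that the pseudopotential figure scales with N_elec^2 while the other two scale with N_elec^3, so the three numbers are not directly comparable.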


* Info: Dumping samples (host skylake, process 2863407)
* Info: Dumping source info for callchain nodes (host skylake, process 2863407)
* Info: Building/writing metadata (host skylake)
* Info: Finished collect step (host skylake, process 2863407)

Your experiment path is /home/kcamus/miniapp_intel/miniqmc/runs/miniqmc_422_zmmhigh_o52_prompt/tools/lprof_npsu_run_0

To display your profiling results:
#########################################################################################################################################################
#    LEVEL    |     REPORT     |                                                        COMMAND                                                         #
#########################################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/kcamus/miniapp_intel/miniqmc/runs/miniqmc_422_zmmhigh_o52_prompt/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/kcamus/miniapp_intel/miniqmc/runs/miniqmc_422_zmmhigh_o52_prompt/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/kcamus/miniapp_intel/miniqmc/runs/miniqmc_422_zmmhigh_o52_prompt/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/kcamus/miniapp_intel/miniqmc/runs/miniqmc_422_zmmhigh_o52_prompt/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/kcamus/miniapp_intel/miniqmc/runs/miniqmc_422_zmmhigh_o52_prompt/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/kcamus/miniapp_intel/miniqmc/runs/miniqmc_422_zmmhigh_o52_prompt/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/kcamus/miniapp_intel/miniqmc/runs/miniqmc_422_zmmhigh_o52_prompt/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/kcamus/miniapp_intel/miniqmc/runs/miniqmc_422_zmmhigh_o52_prompt/tools/lprof_npsu_run_0  #
#########################################################################################################################################################
