* Warning: perf-events measurements are not allowed on node fob1: selecting the no-perf engine. Try:
sudo sysctl -w kernel.perf_event_paranoid=1 (*)
To persist across reboots:
sudo sh -c 'echo kernel.perf_event_paranoid=1 >> /etc/sysctl.d/local.conf' (*)
(*) requires sudo permissions. If missing, contact administrators.
=1 allows both kernel+user-space measurements (=2: only user-space)
* Warning: The 'no-perf' engine is feature-limited and suffers higher overhead than other engines. It should be used only when perf-events are not available on the running Linux kernel - for instance with WSL1 (Windows Subsystem for Linux version 1) - or when the paranoid level (as displayed by 'sysctl kernel.perf_event_paranoid') cannot be lowered to 2 or less.
* Info: Process launched (host fob1, process 1044677)
/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/binaries/miniqmc: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/binaries/miniqmc)
/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/binaries/miniqmc: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.34' not found (required by /home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/binaries/miniqmc)
/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/binaries/miniqmc: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/binaries/miniqmc)
* Warning: Process/thread 1044677 (host fob1) exited with 1
* Info: Process finished (host fob1, process 1044677)
* Warning: Run too short with the given sampling rate (only 12 time-related samples collected). Results may lack precision. Rerun with a longer workload or with sampling-rate=high.
* Warning: Restricted access to kernel symbols:
to see kernel functions in profiling results, reprofile as root
or execute sudo sysctl -w kernel.kptr_restrict=0.
To make kptr_restrict=0 persist across reboots:
sudo sh -c "echo kernel.kptr_restrict=0 >> /etc/sysctl.d/local.conf"
* Info: Dumping samples (host fob1, process 1044677)
* Info: Dumping source info for callchain nodes (host fob1, process 1044677)
* Info: Building/writing metadata (host fob1)
* Info: Finished collect step (host fob1, process 1044677)
Your experiment path is /home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/tools/lprof_npsu_run_0
To display your profiling results:
##############################################################################################################################################
# LEVEL | REPORT | COMMAND #
##############################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/tools/lprof_npsu_run_0 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/tools/lprof_npsu_run_0 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/tools/lprof_npsu_run_0 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/tools/lprof_npsu_run_0 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/tools/lprof_npsu_run_0 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/tools/lprof_npsu_run_0 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/tools/lprof_npsu_run_0 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/tools/lprof_npsu_run_0 #
##############################################################################################################################################
* Info: Selecting the 'perf-low-ppn' engine for node skylake
* Info: Process launched (host skylake, process 665276)
miniqmc git branch: OMP_offload
miniqmc git commit: de45b04eb021c4b57ba6f4bee8f563c614d11135
number of ranks : 1, number of accelerators : 0
Number of orbitals/splines = 1536
Tile size = 1536
Number of tiles = 1
Number of electrons = 3072
Rmax = 1.7
AcceptanceRatio = 0.5
Iterations = 5
OpenMP threads = 16
Number of walkers per rank = 16
SPO coefficients size = 786432000 bytes (750 MB)
delayed update rank = 32
Using SoA distance table, Jastrow + einspline,
and determinant update.
==================================
Use --enable-timers= command line option to increase or decrease level of timing information
Stack timer profile
Timer Inclusive_time Exclusive_time Calls Time_per_call
Setup 0.0537 0.0537 1 0.053681569
ParticleSet:::update 0.0000 0.0000 1 0.000000866
Total 33.0407 0.0658 1 33.040690145
Diffusion 11.4269 0.0225 5 2.285370413
Complete Updates 0.0674 0.0000 5 0.013485747
Determinant::update 0.0674 0.0674 10 0.006741063
Current Gradient 0.4576 0.0153 15360 0.000029789
Determinant::ratio 0.4351 0.4351 15360 0.000028328
OneBodyJastrow 0.0043 0.0043 15360 0.000000280
TwoBodyJastrow 0.0028 0.0028 15360 0.000000184
Kinetic Energy 0.1036 0.1034 5 0.020722681
OneBodyJastrow 0.0001 0.0001 5 0.000026190
TwoBodyJastrow 0.0001 0.0001 5 0.000016489
New Gradient 6.4318 0.0159 15360 0.000418738
Determinant::ratio 0.0387 0.0387 15360 0.000002522
Determinant::spovgl 6.1085 0.1364 15360 0.000397690
Single-Particle Orbitals 5.9721 5.9721 15360 0.000388811
OneBodyJastrow 0.0246 0.0246 15360 0.000001603
TwoBodyJastrow 0.2440 0.2440 15360 0.000015886
ParticleSet:::acceptMove 1.4542 0.0072 7611 0.000191070
DTAAOMPTarget::update_e_e 1.4318 1.4318 7611 0.000188121
DTABOMPTarget::update_ion_e 0.0153 0.0153 7611 0.000002009
ParticleSet:::computeNewPosDT 0.2940 0.0097 15360 0.000019139
DTAAOMPTarget::move_e_e 0.2494 0.2494 15360 0.000016239
DTABOMPTarget::move_ion_e 0.0348 0.0348 15360 0.000002266
ParticleSet:::donePbyP 0.0000 0.0000 5 0.000001335
Update 2.5958 0.0085 7611 0.000341055
Determinant::update 2.3212 2.3212 7611 0.000304981
OneBodyJastrow 0.0017 0.0017 7611 0.000000221
TwoBodyJastrow 0.2644 0.2644 7611 0.000034733
Initialization 2.0492 0.4443 1 2.049213818
Determinant::inverse 0.2552 0.2552 2 0.127594488
Determinant::spovgl 1.1992 0.0436 2 0.599601763
Single-Particle Orbitals 1.1556 1.1556 3072 0.000376180
OneBodyJastrow 0.0037 0.0037 1 0.003701610
ParticleSet:::update 0.1083 0.0166 2 0.054139388
DTAAOMPTarget::evaluate_e_e 0.0894 0.0894 1 0.089437313
DTABOMPTarget::evaluate_ion_e 0.0022 0.0002 1 0.002234304
DTABOMPTarget::offload_ion_e 0.0021 0.0021 1 0.002080841
TwoBodyJastrow 0.0386 0.0386 1 0.038551685
Pseudopotential 19.4988 0.0286 5 3.899767766
Determinant::spoval 18.4254 0.0200 5359 0.003438215
Single-Particle Orbitals 18.4054 18.4054 5359 0.003434492
OneBodyJastrow 0.0227 0.0227 5359 0.000004239
ParticleSet:::update 0.7620 0.0078 5359 0.000142182
DTABOMPTarget::evaluate_e_virtual 0.6664 0.0033 5359 0.000124354
DTABOMPTarget::offload_e_virtual 0.6631 0.6631 5359 0.000123730
DTABOMPTarget::evaluate_ion_virtual 0.0878 0.0028 5359 0.000016380
DTABOMPTarget::offload_ion_virtual 0.0850 0.0850 5359 0.000015866
TwoBodyJastrow 0.2602 0.2602 5359 0.000048546
========== Throughput ============
Total throughput ( N_walkers * N_elec^3 / Total time ) = 1.40389e+10
Diffusion throughput ( N_walkers * N_elec^3 / Diffusion time ) = 4.05935e+10
Pseudopotential throughput ( N_walkers * N_elec^2 / Pseudopotential time ) = 7.74379e+06
* Info: Process finished (host skylake, process 665276)
* Info: Dumping samples (host skylake, process 665276)
* Info: Dumping source info for callchain nodes (host skylake, process 665276)
* Info: Building/writing metadata (host skylake)
* Info: Finished collect step (host skylake, process 665276)
Your experiment path is /home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/tools/lprof_npsu_run_0
To display your profiling results:
##############################################################################################################################################
# LEVEL | REPORT | COMMAND #
##############################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/tools/lprof_npsu_run_0 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/tools/lprof_npsu_run_0 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/tools/lprof_npsu_run_0 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/tools/lprof_npsu_run_0 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/tools/lprof_npsu_run_0 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/tools/lprof_npsu_run_0 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/tools/lprof_npsu_run_0 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/miniqmc/build_icc/bin/OV3_miniqmc_icc_zmm_16T/tools/lprof_npsu_run_0 #
##############################################################################################################################################