Total Time (s) | 72.61 |
Profiled Time (s) | 71.18 |
Time in analyzed loops (%) | 12.3 |
Time in analyzed innermost loops (%) | 11.9 |
Time in user code (%) | 12.8 |
Compilation Options Score (%) | 100 |
Array Access Efficiency (%) | 91.5 |
Potential Speedups |
Perfect Flow Complexity | 1.00 |
Perfect OpenMP + MPI + Pthread | 1.00 |
Perfect OpenMP + MPI + Pthread + Perfect Load Distribution | 19.3 |
No Scalar Integer | Potential Speedup | 1.01 |
No Scalar Integer | Nb Loops to get 80% | 6 |
FP Vectorised | Potential Speedup | 1.03 |
FP Vectorised | Nb Loops to get 80% | 9 |
Fully Vectorised | Potential Speedup | 1.11 |
Fully Vectorised | Nb Loops to get 80% | 11 |
FP Arithmetic Only | Potential Speedup | 1.03 |
FP Arithmetic Only | Nb Loops to get 80% | 6 |
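Note on reading the code-level speedup figures: the sketch below is not MAQAO's own computation. It assumes a simple Amdahl-style bound, in which only the measured share of run time (for example the 12.8% spent in user code) can be accelerated and the rest is unchanged. The function name `amdahl_bound` and the choice of which fractions to plug in are illustrative assumptions.

```python
def amdahl_bound(fraction: float) -> float:
    """Upper bound on whole-run speedup if only `fraction` of the time
    can be accelerated (in the limit, made free). Amdahl-style assumption,
    not a formula taken from the MAQAO report."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return 1.0 / (1.0 - fraction)


# Time shares copied from the global metrics of this report (percent).
shares = {
    "user code": 12.8,
    "analyzed loops": 12.3,
    "analyzed innermost loops": 11.9,
}

for label, percent in shares.items():
    bound = amdahl_bound(percent / 100.0)
    print(f"{label}: at most x{bound:.3f} overall")
```

Under this assumption the bound for the 12.8% user-code share is about 1.15, so the reported code-level speedups (1.01 to 1.11) all sit below it. The 19.3 figure for perfect load distribution concerns parallel behaviour and is outside what this sketch models.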
Source Object | Issue |
▼vmc.mov1 | |
○jastrow4e.f | |
○optci.f | |
○multideterminante.f | |
○determinant.f | |
○optorb.f | |
○optjas.f | |
○determinant_psit.f | |
○determinante.f | |
○scale_dist.f | |
○orbitals.f | |
○multiply_slmi_mderiv.f | |
○determinante_psit.f | |
○nonlpsi.f | |
○deriv_nonlpsi.f | |
○metrop_mov1_slat.f | |
○basis_fns.f | |
○deriv_jastrow4.f90 | |
○optwf_sr.f90 | |
○set_input_data.f90 | |
○splfit.f | |
○deriv_nonloc.f | |
○get_norbterm.f90 | |
○detsav.f | |
○distances.f | |
○nonloc.f | |
○slm.f90 | |
○multideterminant.f | |
Application | /home/kcamus/trex/champ/champ/bin/vmc.mov1 |
Timestamp | 2023-11-14 15:06:12 |
Universal Timestamp | 1699974372 |
Number of processes observed | 1 |
Number of threads observed | 129 |
Experiment Type | OpenMP |
Machine | ip-172-31-68-94 |
Model Name | AMD EPYC 9R14 96-Core Processor |
Architecture | x86_64 |
Micro Architecture | ZEN_V4 |
Cache Size | 1024 KB |
Number of Cores | 96 |
OS Version | Linux 6.2.0-1015-aws #15~22.04.1-Ubuntu SMP Fri Oct 6 21:37:24 UTC 2023 |
Architecture used during static analysis | x86_64 |
Micro Architecture used during static analysis | ZEN_V4 |
Frequency Driver | acpi-cpufreq |
Frequency Governor | performance |
Huge Pages | madvise |
Hyperthreading | off |
Number of sockets | 2 |
Number of cores per socket | 96 |
Compilation Options | vmc.mov1: F90 Flang - 1.5 2017-05-01 '+flang -DTARGET_ARCHITECTURE=\"avx512\" -DVECTORIZATION=\"avx512\" -I/home/kcamus/trex/champ/champ/buildflang/src/module -I/home/kcamus/trex/champ/champ/buildflang/src/parser -march=native -O2 -cpp -mcmodel=large -ffree-line-length-none -g -fno-omit-frame-pointer -fPIC -D_MPI_ -DCLUSTER -ffixed-form -ffixed-line-length-132 -c -o -I/home/kcamus/openmpi/openmpi-5.0.0/_install/include -I/home/kcamus/openmpi/openmpi-5.0.0/_install/lib' |
Dataset | |
Run Command | <executable> -i vmc_optimization_15000.inp |
Number Processes | 1 |
Number Nodes | 1 |
Filter | {type = number ; value = 10 ; } |
Profile Start | {unit = none ; value = 0 ; } |