
| Metric | r0 | r1 | r2 | r3 | r4 | r5 | r6 |
|---|---|---|---|---|---|---|---|
| Total Time (s) | 225.21 | 147.41 | 110.82 | 90.11 | 81.00 | 75.87 | 76.91 |
| Profiled Time (s) | 190.08 | 112.28 | 75.18 | 54.27 | 45.08 | 40.59 | 40.95 |
| Time in analyzed loops (%) | 27.4 | 25.6 | 21.6 | 18.5 | 15.7 | 14.9 | 13.5 |
| Time in analyzed innermost loops (%) | 23.7 | 22.3 | 19.0 | 16.3 | 14.2 | 13.8 | 13.0 |
| Time in user code (%) | 24.1 | 22.6 | 19.4 | 16.7 | 14.3 | 13.9 | 13.1 |
| Compilation Options Score (%) | 72.9 | 72.8 | 73.9 | 75.5 | 77.8 | 82.1 | 88.0 |
| Array Access Efficiency (%) | 90.5 | 91.0 | 90.0 | 90.4 | 91.1 | 91.4 | 92.2 |
| Scalability - Gap | 1.00 | 1.31 | 1.97 | 3.20 | 5.75 | 8.76 | 17.76 |
| Potential Speedups | | | | | | | |
| Perfect Flow Complexity | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Perfect OpenMP + MPI + Pthread | 1.00 | 1.00 | 1.05 | 1.04 | 1.02 | 1.01 | 1.01 |
| Perfect OpenMP + MPI + Pthread + Perfect Load Distribution | 1.00 | 1.17 | 1.54 | 2.13 | 3.30 | 4.08 | 4.72 |
| No Scalar Integer: Potential Speedup | 1.03 | 1.03 | 1.02 | 1.02 | 1.02 | 1.02 | 1.02 |
| No Scalar Integer: Nb Loops to get 80% | 2 | 2 | 2 | 2 | 2 | 2 | 1 |
| FP Vectorised: Potential Speedup | 1.07 | 1.06 | 1.05 | 1.04 | 1.03 | 1.03 | 1.03 |
| FP Vectorised: Nb Loops to get 80% | 8 | 8 | 8 | 8 | 7 | 6 | 4 |
| Fully Vectorised: Potential Speedup | 1.19 | 1.18 | 1.15 | 1.12 | 1.10 | 1.09 | 1.08 |
| Fully Vectorised: Nb Loops to get 80% | 9 | 9 | 9 | 8 | 7 | 6 | 5 |
| Only FP Arithmetic: Potential Speedup | 1.13 | 1.12 | 1.10 | 1.08 | 1.07 | 1.06 | 1.06 |
| Only FP Arithmetic: Nb Loops to get 80% | 4 | 4 | 4 | 4 | 4 | 4 | 3 |
| OpenMP perfectly balanced: Potential Speedup | 1.00 | 1.00 | 1.03 | 1.02 | 1.01 | 1.01 | 1.01 |
| OpenMP perfectly balanced: Nb Loops to get 80% | 1 | 7 | 3 | 3 | 3 | 3 | 3 |


| Source Object | Issue |
|---|---|
| bench_jastrow | -g is missing for some functions (possibly ones added by the compiler), it is needed to have more accurate reports. Other recommended flags are: -O2/-O3, -march=(target) |
| libqmckl.so.0.0.0 / qmckl_distance_f.F90 | |
| libqmckl.so.0.0.0 / qmckl_jastrow_champ_f.F90 | |
| libqmckl.so.0.0.0 / qmckl_jastrow_champ.c | |


| Parameter | r0 | r1 | r2 | r3 | r4 | r5 | r6 |
|---|---|---|---|---|---|---|---|
| Experiment Name | m1o1 | m1o1 | m1o1 | m1o1 | m1o1 | m1o1 | m1o1 | 
| Application | ./../qmckl_bench/build/bench_jastrow | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| Timestamp | 2024-02-13 11:58:20 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| Experiment Type | Sequential | OpenMP | same as r1 | same as r1 | same as r1 | same as r1 | same as r1 |
| Machine | skylake | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| Architecture | x86_64 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| Micro Architecture | SKYLAKE | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| Model Name | Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| Cache Size | 36608 KB | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| Number of Cores | 26 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| Maximal Frequency | 2.1 GHz | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| OS Version | Linux 6.5.7-arch1-1 #1 SMP PREEMPT_DYNAMIC Tue, 10 Oct 2023 21:10:21 +0000 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| Architecture used during static analysis | x86_64 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| Micro Architecture used during static analysis | SKYLAKE | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| Compilation Options | bench_jastrow:    libqmckl.so.0.0.0: Intel(R) C Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.8.0 Build 20221119_000000 -I. -I/home/kcamus/comparative/qmckl/qmckl -I./include -I./src -I./include -I/home/kcamus/comparative/qmckl/qmckl/src -I/home/kcamus/comparative/qmckl/qmckl/include -I/home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/ -I/home/kcamus/comparative/qmckl/trexio/_install/include -DHAVE_CONFIG_H -DQMCKL_TEST_DIR=\"/home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/\" -march=native -ip -Ofast -ftz -finline -fopenmp -mkl=sequential -g -fno-omit-frame-pointer -fopenmp -MT src/qmckl_jastrow_champ.lo -MD -MP -MF src/.deps/qmckl_jastrow_champ.Tpo -c -fPIC -DPIC -o src/.libs/qmckl_jastrow_champ.o  | libqmckl.so.0.0.0: Intel(R) C Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.8.0 Build 20221119_000000  -I. -I/home/kcamus/comparative/qmckl/qmckl -I./include -I./src -I./include -I/home/kcamus/comparative/qmckl/qmckl/src -I/home/kcamus/comparative/qmckl/qmckl/include -I/home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/ -I/home/kcamus/comparative/qmckl/trexio/_install/include -DHAVE_CONFIG_H -DQMCKL_TEST_DIR=\"/home/kcamus/comparative/qmckl/qmckl/share/qmckl/test_data/\" -march=native -ip -Ofast -ftz -finline -fopenmp -mkl=sequential -g -fno-omit-frame-pointer -fopenmp -MT src/qmckl_jastrow_champ.lo -MD -MP -MF src/.deps/qmckl_jastrow_champ.Tpo -c -fPIC -DPIC -o src/.libs/qmckl_jastrow_champ.o  bench_jastrow:  | same as r0 | same as r0 | same as r1 | same as r0 | same as r1 | 
| Number of processes observed | 1 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| Number of threads observed | 1 | 2 | 4 | 8 | 16 | 26 | 52 | 
| Frequency Driver | intel_cpufreq | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| Frequency Governor | performance | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| Huge Pages | always | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| Hyperthreading | off | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| Number of sockets | 2 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| Number of cores per socket | 26 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| MAQAO version | 2.19.0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| MAQAO build | b37ee48e971324d4eaf9054a5a16e1bfd5003152::20240201-180403 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | 
| Comments | - | - | - | - | - | - | - |
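
The scaling behaviour reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not MAQAO's own code: it takes the Total Time (s) and Number of threads observed rows from the tables and derives speedup and parallel efficiency relative to r0. It also computes `threads * T_n / T_0`, which happens to reproduce the reported Scalability - Gap values to two decimals. Treating that expression as the metric's definition is an assumption inferred from the numbers, and the function name `scaling_summary` is hypothetical.

```python
# Values copied from the report tables (runs r0..r6).
total_time = [225.21, 147.41, 110.82, 90.11, 81.00, 75.87, 76.91]  # Total Time (s)
threads = [1, 2, 4, 8, 16, 26, 52]  # Number of threads observed


def scaling_summary(times, thread_counts):
    """Return (threads, speedup, efficiency, gap) per run, relative to the first run.

    gap = threads * T_n / T_0 is an assumed formula: it matches the report's
    "Scalability - Gap" row numerically, but it is not taken from MAQAO documentation.
    """
    base = times[0]
    rows = []
    for t, n in zip(times, thread_counts):
        speedup = base / t          # how much faster than the sequential run
        efficiency = speedup / n    # fraction of ideal linear scaling achieved
        gap = n * t / base          # ratio of measured time to ideal time (1 / efficiency)
        rows.append((n, speedup, efficiency, gap))
    return rows


if __name__ == "__main__":
    for n, s, e, g in scaling_summary(total_time, threads):
        print(f"{n:>3} threads: speedup {s:.2f}, efficiency {e:.1%}, gap {g:.2f}")
```

Read this way, total time stops improving between 26 and 52 threads (75.87 s versus 76.91 s), and the best speedup over the sequential run stays just under 3.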