| Metric | r0 | r1 | r2 | r3 | r4 | r5 | r6 | r7 | r8 | r9 |
|---|---|---|---|---|---|---|---|---|---|---|
| Total Time (s) | 396.01 | 45.68 | 43.16 | 41.96 | 45.24 | 41.21 | 42.18 | 43.20 | 47.68 | 44.16 |
| Max (Thread Active Time) (s) | 395.55 | 44.99 | 42.27 | 41.00 | 44.10 | 39.88 | 40.69 | 41.40 | 45.51 | 41.91 |
| Average Active Time (s) | 395.50 | 44.40 | 41.18 | 39.89 | 42.54 | 37.82 | 37.71 | 37.78 | 41.00 | 36.83 |
| Activity Ratio (%) | 99.9 | 97.3 | 95.7 | 95.3 | 94.2 | 92.0 | 89.6 | 87.6 | 86.2 | 83.5 |
| Average number of active threads | 5.992 | 93.326 | 114.495 | 119.801 | 135.392 | 154.162 | 171.671 | 188.883 | 206.399 | 213.474 |
| Affinity Stability (%) | 100.0 | 97.2 | 95.3 | 96.4 | 94.1 | 91.4 | 89.0 | 87.6 | 84.8 | 86.3 |
| GFLOPS | 14.474 | 135.573 | 154.479 | 138.127 | 131.525 | 127.687 | 122.572 | 120.074 | 98.286 | 123.335 |
| Time in analyzed loops (%) | 9.40 | 7.15 | 5.83 | 5.76 | 5.01 | 4.68 | 3.97 | 3.33 | 2.98 | 3.41 |
| Time in analyzed innermost loops (%) | 9.40 | 7.15 | 5.83 | 5.76 | 5.01 | 4.68 | 3.97 | 3.33 | 2.98 | 3.41 |
| Time in user code (%) | 10.5 | 8.57 | 7.35 | 7.30 | 6.50 | 6.12 | 5.39 | 4.71 | 4.31 | 4.74 |
| Compilation Options Score (%) | 50.0 | 50.0 | 50.0 | 50.0 | 50.0 | 50.0 | 50.0 | 50.0 | 50.0 | 50.0 |
| Array Access Efficiency (%) | 84.6 | 88.3 | 88.1 | 88.2 | 88.1 | 89.3 | 90.3 | 89.8 | 89.5 | 90.0 |
| Potential Speedups | | | | | | | | | | |
| Perfect Flow Complexity | 1.00 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 |
| Perfect OpenMP/MPI/Pthread/TBB | 1.01 | 1.21 | 1.28 | 1.31 | 1.31 | 1.30 | 1.32 | 1.31 | 1.27 | 1.27 |
| Perfect OpenMP/MPI/Pthread/TBB + Perfect Load Distribution | 1.17 | 1.33 | 1.48 | 1.53 | 1.58 | 1.47 | 1.56 | 1.57 | 1.54 | 1.59 |
| Scalability - Gap | 1.00 | 1.85 | 2.18 | 2.22 | 2.74 | 2.91 | 3.41 | 3.93 | 4.82 | 4.76 |
| No Scalar Integer: Potential Speedup | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| No Scalar Integer: Nb Loops to get 80% | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| FP Vectorised: Potential Speedup | 1.01 | 1.02 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 |
| FP Vectorised: Nb Loops to get 80% | 4 | 5 | 5 | 4 | 4 | 4 | 3 | 3 | 3 | 3 |
| Fully Vectorised: Potential Speedup | 1.08 | 1.06 | 1.05 | 1.05 | 1.04 | 1.04 | 1.03 | 1.03 | 1.02 | 1.03 |
| Fully Vectorised: Nb Loops to get 80% | 9 | 8 | 7 | 7 | 6 | 6 | 5 | 5 | 4 | 4 |
| Only FP Arithmetic: Potential Speedup | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 |
| Only FP Arithmetic: Nb Loops to get 80% | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 1 |
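
The Scalability - Gap row can be cross-checked against the Total Time (s) row and the Number of processes observed row of this report. The sketch below assumes the gap is the ideal speedup over r0 (process count ratio) divided by the observed speedup (time ratio). That formula is an assumption inferred from the numbers, not taken from MAQAO documentation, and the function name `scalability` is hypothetical.

```python
# Values copied from the "Total Time (s)" and "Number of processes observed" rows.
total_time_s = [396.01, 45.68, 43.16, 41.96, 45.24, 41.21, 42.18, 43.20, 47.68, 44.16]
processes = [6, 96, 120, 126, 144, 168, 192, 216, 240, 256]


def scalability(times, procs):
    """Return (speedup, efficiency, gap) per run, relative to the first run."""
    base_time, base_procs = times[0], procs[0]
    rows = []
    for t, p in zip(times, procs):
        speedup = base_time / t       # observed speedup over r0
        ideal = p / base_procs        # speedup under perfect scaling (assumed definition)
        efficiency = speedup / ideal  # fraction of the ideal speedup achieved
        gap = ideal / speedup         # assumed definition of the scalability gap
        rows.append((speedup, efficiency, gap))
    return rows


if __name__ == "__main__":
    for i, (s, e, g) in enumerate(scalability(total_time_s, processes)):
        print(f"r{i}: speedup={s:.2f} efficiency={e:.1%} gap={g:.2f}")
```

Under this assumption the computed gap agrees with the reported row to within about 0.01 for every run (for example roughly 1.85 for r1 and 4.76 for r9). The small differences are consistent with the report using unrounded times.
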
| Source Object | Issue |
|---|---|
| exec | |
| device_memcpy.f90 | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| device_memcpy.f90 | -funroll-loops is missing. |
| addusdens.f90 | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| addusdens.f90 | -funroll-loops is missing. |
| vloc_psi.f90 | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| vloc_psi.f90 | -funroll-loops is missing. |
| usnldiag.f90 | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| usnldiag.f90 | -funroll-loops is missing. |
| thread_util.f90 | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| thread_util.f90 | -funroll-loops is missing. |
| fft_scatter_2d.f90 | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| fft_scatter_2d.f90 | -funroll-loops is missing. |
| qvan2.f90 | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| qvan2.f90 | -funroll-loops is missing. |
| init_us_2_acc.f90 | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| init_us_2_acc.f90 | -funroll-loops is missing. |
| fft_helper_subroutines.f90 | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| fft_helper_subroutines.f90 | -funroll-loops is missing. |
| g_psi.f90 | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| g_psi.f90 | -funroll-loops is missing. |
| h_psi.f90 | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| h_psi.f90 | -funroll-loops is missing. |
| sum_band.f90 | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| sum_band.f90 | -funroll-loops is missing. |
| sort.f90 | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| sort.f90 | -funroll-loops is missing. |
| cegterg.f90 | -march=x86-64 is used but it should be replaced by a more architecture specific option or -march=native. |
| cegterg.f90 | -funroll-loops is missing. |
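
The two issues reported for every source file can be reproduced with a simple check on the compiler option string. The following is a minimal sketch, not MAQAO's actual logic: the function `check_options`, the set `GENERIC_MARCH`, and the wording of the returned messages are all hypothetical, and the option string is a shortened copy of the one listed under Compilation Options in this report.

```python
import shlex

# Shortened copy of the options reported for exec; the omitted flags
# do not influence the two checks below.
OPTIONS = "-mtune=generic -march=x86-64 -g -O3 -fno-omit-frame-pointer -fopenmp"

# Assumption: only the baseline value flagged in the report counts as generic.
GENERIC_MARCH = {"x86-64"}


def check_options(options):
    """Return a list of issue strings for a compiler option string."""
    flags = shlex.split(options)
    issues = []
    march = [f.split("=", 1)[1] for f in flags if f.startswith("-march=")]
    if not march:
        issues.append("-march is missing")
    elif march[-1] in GENERIC_MARCH:
        # Assumption: when -march is repeated, the last occurrence takes effect.
        issues.append(f"-march={march[-1]} is generic; consider -march=native")
    if "-funroll-loops" not in flags:
        issues.append("-funroll-loops is missing")
    return issues


if __name__ == "__main__":
    for issue in check_options(OPTIONS):
        print(issue)
```

An option string that already contains `-march=native` and `-funroll-loops` yields an empty list. Note that `-march=native` ties the binary to the instruction set of the build host, so it only suits builds made on the same CPU type that runs the job.
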
| Parameter | r0 | r1 | r2 | r3 | r4 | r5 | r6 | r7 | r8 | r9 |
|---|---|---|---|---|---|---|---|---|---|---|
| Application | /beegfs/hackathon/users/eoseret/qaas_runs_test/isix02.benchmarkcenter.megware.com/177-221-2605/qe/run/base_runs/defaults/gcc/exec | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Timestamp | 2026-02-27 20:21:58 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Experiment Type | MPI | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Machine | isix02.benchmarkcenter.megware.com | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Architecture | x86_64 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Micro Architecture | GRANITE_RAPIDS | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Model Name | Intel(R) Xeon(R) 6980P | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Cache Size | 516096 KB | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Number of Cores | 128 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Maximal Frequency | 3.9 GHz | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| OS Version | Linux 5.14.0-611.16.1.el9_7.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Dec 22 03:40:39 EST 2025 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Architecture used during static analysis | x86_64 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Micro Architecture used during static analysis | GRANITE_RAPIDS | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Compilation Options | exec: GNU Fortran2008 14.2.0 -mtune=generic -march=x86-64 -g -O3 -fno-omit-frame-pointer -fcf-protection=none -fallow-argument-mismatch -fopenmp -J FFTXlib/src/mod/qe_fftx -fintrinsic-modules-path /cluster/comp/gcc/14.2.0/lib/gcc/x86_64-pc-linux-gnu/14.2.0/finclude -fpre-include=/usr/include/finclude/math-vector-fortran.h | same as r0 | exec: GNU Fortran2008 14.2.0 -mtune=generic -march=x86-64 -g -O3 -fno-omit-frame-pointer -fcf-protection=none -fallow-argument-mismatch -fopenmp -J Modules/mod/qe_modules -fintrinsic-modules-path /cluster/comp/gcc/14.2.0/lib/gcc/x86_64-pc-linux-gnu/14.2.0/finclude -fpre-include=/usr/include/finclude/math-vector-fortran.h | same as r2 | same as r2 | same as r2 | same as r2 | same as r2 | same as r2 | same as r2 |
| Number of processes observed | 6 | 96 | 120 | 126 | 144 | 168 | 192 | 216 | 240 | 256 |
| Number of threads observed | 6 | 96 | 120 | 126 | 144 | 168 | 192 | 216 | 240 | 256 |
| Frequency Driver | intel_pstate | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Frequency Governor | performance | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Huge Pages | always | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Hyperthreading | on | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Number of sockets | 2 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Number of cores per socket | 128 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| MAQAO version | 2026.0.0-b | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| MAQAO build | d53714498d38428ad6a75949c25b07c813f07f11::20260206-105209 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
| Comments | OV scalability run using gcc | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |