- r0: run_0
- r1: omp_2_threads
- r2: omp_4_threads
- r3: omp_8_threads
- r4: omp_16_threads
- r5: omp_32_threads
- r6: omp_64_threads
- r7: omp_72_threads
- r8: omp_144_threads
Metric | r0 | r1 | r2 | r3 | r4 | r5 | r6 | r7 | r8 |
---|---|---|---|---|---|---|---|---|---|
Total Time (s) | 436.09 | 251.29 | 127.62 | 66.94 | 35.81 | 20.13 | 11.86 | 10.98 | 8.50 |
Profiled Time (s) | 435.87 | 251.07 | 127.48 | 66.86 | 34.68 | 19.46 | 11.29 | 10.54 | 7.94 |
Time in analyzed loops (%) | 99.6 | 98.9 | 98.8 | 97.2 | 96.3 | 89.6 | 87.5 | 85.2 | 54.0 |
Time in analyzed innermost loops (%) | 94.8 | 94.5 | 94.9 | 93.4 | 92.6 | 86.1 | 84.2 | 82.0 | 51.9 |
Time in user code (%) | 99.6 | 98.9 | 98.8 | 97.2 | 96.4 | 89.6 | 87.6 | 85.3 | 54.3 |
Compilation Options Score (%) | 75.0 | 75.0 | 75.0 | 75.0 | 75.0 | 75.0 | 75.0 | 75.0 | 75.0 |
Array Access Efficiency (%) | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available |
Scalability - Gap | 1.00 | 1.15 | 1.17 | 1.23 | 1.31 | 1.48 | 1.74 | 1.81 | 2.81 |
Potential Speedups |
Perfect Flow Complexity | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
Perfect OpenMP + MPI + Pthread | 1.00 | 1.00 | 1.01 | 1.02 | 1.02 | 1.10 | 1.13 | 1.12 | 1.42 |
Perfect OpenMP + MPI + Pthread + Perfect Load Distribution | 1.00 | 1.01 | 1.02 | 1.05 | 1.08 | 1.20 | 1.31 | 1.36 | 2.24 |
No Scalar Integer: Potential Speedup | 1.02 | 1.02 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 |
No Scalar Integer: Nb Loops to get 80% | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
FP Vectorised: Potential Speedup | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
FP Vectorised: Nb Loops to get 80% | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Fully Vectorised: Potential Speedup | 1.04 | 1.04 | 1.03 | 1.03 | 1.03 | 1.03 | 1.03 | 1.03 | 1.02 |
Fully Vectorised: Nb Loops to get 80% | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Only FP Arithmetic: Potential Speedup | 1.02 | 1.02 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 | 1.01 |
Only FP Arithmetic: Nb Loops to get 80% | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
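The scaling behaviour in the table can be checked by hand from the Total Time row. The sketch below computes speedup and parallel efficiency per run. It also computes `n * T_n / T_1`, which happens to reproduce the reported "Scalability - Gap" values when rounded to two decimals. That formula is an observation drawn from these numbers, not a definition taken from MAQAO, and the function and variable names are illustrative only.

```python
# Thread counts and "Total Time (s)" values copied from the table (r0..r8).
threads = [1, 2, 4, 8, 16, 32, 64, 72, 144]
total_time = [436.09, 251.29, 127.62, 66.94, 35.81, 20.13, 11.86, 10.98, 8.50]


def scaling(threads, times):
    """Return (threads, speedup, efficiency, gap) tuples relative to the first run.

    gap = n * T_n / T_1 is an assumed formula that matches the reported
    "Scalability - Gap" row; it is the inverse of parallel efficiency.
    """
    t1 = times[0]
    rows = []
    for n, t in zip(threads, times):
        speedup = t1 / t          # how much faster than the sequential run
        efficiency = speedup / n  # 1.0 would be ideal linear scaling
        gap = n * t / t1          # 1.0 would be ideal; larger means worse scaling
        rows.append((n, round(speedup, 2), round(efficiency, 2), round(gap, 2)))
    return rows


if __name__ == "__main__":
    for n, speedup, efficiency, gap in scaling(threads, total_time):
        print(f"{n:>4} threads: speedup {speedup:6.2f}, "
              f"efficiency {efficiency:4.2f}, gap {gap:4.2f}")
```

Read this way, the 144-thread run is about 51 times faster than the sequential run, which corresponds to a parallel efficiency of roughly 0.36.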
Compilation option issues (same for r0 through r8):
Source Object | Issue |
spmxv.exe | |
ooo_cmdline.h | -funroll-loops is missing. |
ooo_cmdline.cpp | -funroll-loops is missing. |
main.cpp | -funroll-loops is missing. |
| r0 | r1 | r2 | r3 | r4 | r5 | r6 | r7 | r8 |
Experiment Name | | | | | | | | | |
Application | ./spmxv.exe | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Timestamp | 2024-07-03 14:49:56 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Experiment Type | Sequential | OpenMP | same as r1 | same as r1 | same as r1 | same as r1 | same as r1 | same as r1 | same as r1 |
Machine | p11-grace01.cs.it4i.cz | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Architecture | | | | | | | | | |
Micro Architecture | | | | | | | | | |
Model Name | | | | | | | | | |
Cache Size | | | | | | | | | |
Number of Cores | | | | | | | | | |
Maximal Frequency | 3.42 GHz | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
OS Version | Linux 5.14.0-362.18.1.el9_3.aarch64 #1 SMP PREEMPT_DYNAMIC Thu Jan 25 07:56:00 UTC 2024 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Architecture used during static analysis | aarch64 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Micro Architecture used during static analysis | ARM_NEOVERSE_V1 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Compilation Options | spmxv.exe: GNU C++17 12.1.0 -mlittle-endian -mabi=lp64 -mcpu=demeter+crypto+rcpc+sve2-sm4+sve2-aes+sve2-sha3+nodotprod+noprofile+norng+nomemtag+nopredres+nopauth -g -O3 -fopenmp | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Number of processes observed | 1 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Number of threads observed | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 72 | 144 |
Frequency Driver | cppc_cpufreq | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Frequency Governor | performance | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Huge Pages | always | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Hyperthreading | off | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Number of sockets | 2 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Number of cores per socket | 72 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
MAQAO version | 2.20.3 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
MAQAO build | bfc89c69b7374f41fdba9d7e1e206b0cf5900829::20240621-165222 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Comments | | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |