- r0: baseline
- r1: locus440
Metric | r0 | r1 |
---|---|---|
Total Time (s) | 376.72 | 13.79 |
Profiled Time (s) | 374.93 | 13.61 |
Time in analyzed loops (%) | 43.1 | 40.2 |
Time in analyzed innermost loops (%) | 43.0 | 38.6 |
Time in user code (%) | 43.1 | 40.3 |
Compilation Options Score (%) | 100 | 100 |
Array Access Efficiency (%) | 74.9 | 55.6 |
Potential Speedups | | |
Perfect Flow Complexity | 1.00 | 1.00 |
Perfect OpenMP + MPI + Pthread | 1.11 | 1.16 |
Perfect OpenMP + MPI + Pthread + Perfect Load Distribution | 2.32 | 2.51 |
No Scalar Integer: Potential Speedup | 1.00 | 1.00 |
No Scalar Integer: Nb Loops to get 80% | 1 | 1 |
FP Vectorised: Potential Speedup | 1.00 | 1.00 |
FP Vectorised: Nb Loops to get 80% | 1 | 1 |
Fully Vectorised: Potential Speedup | 1.00 | 1.44 |
Fully Vectorised: Nb Loops to get 80% | 2 | 1 |
Only FP Arithmetic: Potential Speedup | 1.61 | 1.44 |
Only FP Arithmetic: Nb Loops to get 80% | 1 | 1 |
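
The Total Time row gives the measured end-to-end gain of r1 over r0. As a minimal sketch of that arithmetic, the snippet below divides the two values copied from the table above. The variable names are illustrative and are not part of MAQAO's output.

```python
# Total Time (s) values copied from the Total Time row of the table above.
total_time_s = {"r0": 376.72, "r1": 13.79}

# Observed speedup of r1 relative to the r0 baseline: baseline time / new time.
observed_speedup = total_time_s["r0"] / total_time_s["r1"]

print(f"Observed speedup of r1 over r0: {observed_speedup:.2f}x")
```

The same ratio applied to the Profiled Time row should give a very similar figure, since profiling covers almost all of the run in both experiments.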

Source Object | Issue |
---|---|
permute3d_1.omp.exe | |
permute3d_1.omp.cpp | |

Source Object | Issue |
---|---|
permute3d_1.locus440.exe | |
permute3d_1.locus440.cpp | |

Property | r0 | r1 |
---|---|---|
Experiment Name | | |
Application | ./permute3d_1.omp.exe | ./permute3d_1.locus440.exe |
Timestamp | 2024-11-07 10:23:41 | 2024-11-07 10:05:02 |
Experiment Type | OpenMP; | same as r0 |
Machine | itp06.benchmarkcenter.megware.com | same as r0 |
Architecture | x86_64 | same as r0 |
Micro Architecture | ICELAKE_SP | same as r0 |
Model Name | Intel(R) Xeon(R) Platinum 8368 CPU @ 2.40GHz | same as r0 |
Cache Size | 58368 KB | same as r0 |
Number of Cores | 38 | same as r0 |
Maximal Frequency | 3.4 GHz | same as r0 |
OS Version | Linux 5.14.0-427.18.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Tue May 28 06:27:02 EDT 2024 | same as r0 |
Architecture used during static analysis | x86_64 | same as r0 |
Micro Architecture used during static analysis | ICELAKE_SP | same as r0 |
Compilation Options | permute3d_1.omp.exe: clang based Intel(R) oneAPI DPC++/C++ Compiler 2024.0.0 (2024.0.0.20231017) --driver-mode=g++ --intel -I . -o permute3d_1.omp.o -c -g -O3 -mprefer-vector-width=512 -march=native -Wall -Wno-unknown-pragmas -fiopenmp -fiopenmp -fopenmp-targets=spir64 permute3d_1.omp.cpp -fveclib=SVML -fheinous-gnu-extensions | permute3d_1.locus440.exe: clang based Intel(R) oneAPI DPC++/C++ Compiler 2024.0.0 (2024.0.0.20231017) --driver-mode=g++ --intel -I . -o permute3d_1.locus440.o -c -g -O3 -mprefer-vector-width=512 -march=native -Wall -Wno-unknown-pragmas -fiopenmp -fiopenmp -fopenmp-targets=spir64 permute3d_1.locus440.cpp -fveclib=SVML -fheinous-gnu-extensions |
Number of processes observed | 1 | same as r0 |
Number of threads observed | 76 | same as r0 |
Frequency Driver | intel_pstate | same as r0 |
Frequency Governor | performance | same as r0 |
Huge Pages | always | same as r0 |
Hyperthreading | on | same as r0 |
Number of sockets | 2 | same as r0 |
Number of cores per socket | 38 | same as r0 |
MAQAO version | 2.20.10 | same as r0 |
MAQAO build | 4ac1b2f5b5fdb6964b480406b6b2a13ea0924e38::20241106-170444 | same as r0 |
Comments | | same as r0 |