- r0: orig
- r1: compilers/clang_21
Metric | r0 | r1 |
---|---|---|
Total Time (s) | 62.74 | 62.41 |
Profiled Time (s) | 56.86 | 56.56 |
GFLOPS | 104.307 | 76.234 |
Time in analyzed loops (%) | 96.6 | 96.8 |
Time in analyzed innermost loops (%) | 96.5 | 96.7 |
Time in user code (%) | 96.6 | 96.8 |
Compilation Options Score (%) | 100 | 100 |
Array Access Efficiency (%) | 87.1 | 85.2 |
Potential Speedups | | |
Perfect Flow Complexity | 1.10 | 1.10 |
Perfect OpenMP + MPI + Pthread | 1.02 | 1.02 |
Perfect OpenMP + MPI + Pthread + Perfect Load Distribution | 1.04 | 1.04 |
No Scalar Integer: Potential Speedup | 1.02 | 1.04 |
No Scalar Integer: Nb Loops to get 80% | 4 | 7 |
FP Vectorised: Potential Speedup | 1.13 | 1.09 |
FP Vectorised: Nb Loops to get 80% | 4 | 3 |
Fully Vectorised: Potential Speedup | 1.18 | 1.14 |
Fully Vectorised: Nb Loops to get 80% | 5 | 4 |
Only FP Arithmetic: Potential Speedup | 1.18 | 1.21 |
Only FP Arithmetic: Nb Loops to get 80% | 8 | 11 |
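
To compare the two runs at a glance, the headline metrics above can be converted into relative changes. The following is a minimal Python sketch with values copied from the metrics table; the `relative_change` helper and the dictionary layout are illustrative assumptions and are not part of MAQAO.

```python
# Values copied from the metrics table (r0 = orig, r1 = compilers/clang_21).
metrics = {
    "Total Time (s)": (62.74, 62.41),
    "Profiled Time (s)": (56.86, 56.56),
    "GFLOPS": (104.307, 76.234),
    "Array Access Efficiency (%)": (87.1, 85.2),
}

def relative_change(r0, r1):
    """Return the change from r0 to r1 as a percentage of r0 (hypothetical helper)."""
    return (r1 - r0) / r0 * 100.0

changes = {name: relative_change(r0, r1) for name, (r0, r1) in metrics.items()}

for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}%")
```

Read this way, total time changes by roughly half a percent between the runs, while the reported GFLOPS figure drops by about 27%.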
Source Object | Issue |
---|---|
exec | |
accelerate_kernel.f90-pp.f90 | |
ideal_gas_kernel.f90-pp.f90 | |
initialise_chunk_kernel.f90-pp.f90 | |
viscosity_kernel.f90-pp.f90 | |
advec_mom_kernel.f90-pp.f90 | |
calc_dt_kernel.f90-pp.f90 | |
build_field.f90-pp.f90 | |
field_summary_kernel.f90-pp.f90 | |
generate_chunk_kernel.f90-pp.f90 | |
flux_calc_kernel.f90-pp.f90 | |
PdV_kernel.f90-pp.f90 | |
update_halo_kernel.f90-pp.f90 | |
revert_kernel.f90-pp.f90 | |
advec_cell_kernel.f90-pp.f90 | |
reset_field_kernel.f90-pp.f90 | |
Source Object | Issue |
---|---|
exec | |
accelerate_kernel.f90-pp.f90 | |
ideal_gas_kernel.f90-pp.f90 | |
initialise_chunk_kernel.f90-pp.f90 | |
viscosity_kernel.f90-pp.f90 | |
advec_mom_kernel.f90-pp.f90 | |
update_halo_kernel.f90-pp.f90 | |
build_field.f90-pp.f90 | |
field_summary_kernel.f90-pp.f90 | |
generate_chunk_kernel.f90-pp.f90 | |
flux_calc_kernel.f90-pp.f90 | |
advec_cell_kernel.f90-pp.f90 | |
calc_dt_kernel.f90-pp.f90 | |
revert_kernel.f90-pp.f90 | |
PdV_kernel.f90-pp.f90 | |
reset_field_kernel.f90-pp.f90 | |
Setting | r0 | r1 |
---|---|---|
Experiment Name | | |
Application | /home/kcamus/qaas_runs/170-308-5670/intel/CloverLeafFC/run/oneview_runs/defaults/orig/exec | /home/kcamus/qaas_runs/170-308-5670/intel/CloverLeafFC/run/binaries/clang_21/exec |
Timestamp | 2023-12-20 15:34:02 | 2023-12-20 16:44:55 |
Experiment Type | MPI; OpenMP; | same as r0 |
Machine | ip-172-31-68-94 | same as r0 |
Architecture | x86_64 | same as r0 |
Micro Architecture | ZEN_V4 | same as r0 |
Model Name | AMD EPYC 9R14 96-Core Processor | same as r0 |
Cache Size | 1024 KB | same as r0 |
Number of Cores | 96 | same as r0 |
Maximal Frequency | 3.701953 GHz | same as r0 |
OS Version | Linux 6.2.0-1017-aws #17~22.04.1-Ubuntu SMP Fri Nov 17 21:07:13 UTC 2023 | same as r0 |
Architecture used during static analysis | x86_64 | same as r0 |
Micro Architecture used during static analysis | ZEN_V4 | same as r0 |
Compilation Options | exec: F90 Flang - 1.5 2017-05-01 '+flang -I/home/kcamus/qaas_runs/170-308-5670/intel/CloverLeafFC/build/CloverLeafFC/CloverLeaf_ref/kernels -O3 -march=native -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -fopenmp -c -o -I/home/kcamus/openmpi/openmpi-5.0.0/_install/include -I/home/kcamus/openmpi/openmpi-5.0.0/_install/lib' | exec: F90 Flang - 1.5 2017-05-01 '+flang -I/home/kcamus/qaas_runs/170-308-5670/intel/CloverLeafFC/build/CloverLeafFC/CloverLeaf_ref/kernels -Ofast -march=znver4 -mprefer-vector-width=512 -flto -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -fopenmp -c -o -I/home/kcamus/openmpi/openmpi-5.0.0/_install/include -I/home/kcamus/openmpi/openmpi-5.0.0/_install/lib' |
Number of processes observed | 1 | same as r0 |
Number of threads observed | 192 | same as r0 |
Frequency Driver | acpi-cpufreq | same as r0 |
Frequency Governor | performance | same as r0 |
Huge Pages | madvise | same as r0 |
Hyperthreading | off | same as r0 |
Number of sockets | 2 | same as r0 |
Number of cores per socket | 96 | same as r0 |
MAQAO version | 2.18.1 | same as r0 |
MAQAO build | 577f2e430dc41154e3ac510c4b67111c38b3cbf1::20231218-170050 | same as r0 |
Comments | | same as r0 |
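
The two runs differ only in their compilation options, so it can help to isolate the flags that changed. The sketch below, in Python, compares the optimization-related flags copied from the Compilation Options row above; the include paths, which are identical in both runs, are left out for brevity, and the `flag_diff` helper is a hypothetical name introduced for illustration.

```python
# Flags copied from the Compilation Options row (shared -I include paths omitted).
r0_flags = ("-O3 -march=native -g -fno-omit-frame-pointer "
            "-fcf-protection=none -no-pie -fopenmp")
r1_flags = ("-Ofast -march=znver4 -mprefer-vector-width=512 -flto -g "
            "-fno-omit-frame-pointer -fcf-protection=none -no-pie -fopenmp")

def flag_diff(a, b):
    """Return (flags only in a, flags only in b), each sorted (hypothetical helper)."""
    a_set, b_set = set(a.split()), set(b.split())
    return sorted(a_set - b_set), sorted(b_set - a_set)

only_r0, only_r1 = flag_diff(r0_flags, r1_flags)
print("only in r0:", only_r0)
print("only in r1:", only_r1)
```

This treats each flag as an opaque token, which is enough here because neither command line repeats a flag or depends on flag order.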