- r0: gcc_o3_m80_size512-512-768_m4-4-5/
- r1: gcc_ofast_m80_size512-512-768_m4-4-5/
- r2: acfl_o3_m80_size512-512-768_m4-4-5/
- r3: acfl_ofast_m80_size512-512-768_m4-4-5/
Metric | r0 | r1 | r2 | r3 |
---|---|---|---|---|
Total Time (s) | 53.34 | 53.41 | 55.21 | 55.23 |
Max (Thread Active Time) (s) | 51.59 | 51.62 | 53.21 | 53.30 |
Average Active Time (s) | 51.36 | 51.41 | 53.00 | 53.06 |
Activity Ratio (%) | 96.2 | 96.3 | 95.9 | 96.0 |
Average number of active threads | 77.034 | 77.001 | 76.787 | 76.855 |
Affinity Stability (%) | 16.1 | 15.2 | 18.7 | 18.2 |
Time in analyzed loops (%) | 93.5 | 93.9 | 94.3 | 93.9 |
Time in analyzed innermost loops (%) | 39.5 | 37.3 | 78.2 | 83.8 |
Time in user code (%) | 93.5 | 93.9 | 94.3 | 93.9 |
Compilation Options Score (%) | 75.0 | 75.0 | 100 | 100 |
Array Access Efficiency (%) | 37.1 | 29.7 | 57.7 | 58.0 |
Potential Speedups | | | | |
Perfect Flow Complexity | 1.00 | 1.00 | 1.00 | 1.00 |
Perfect OpenMP + MPI + Pthread | 1.06 | 1.05 | 1.05 | 1.06 |
Perfect OpenMP + MPI + Pthread + Perfect Load Distribution | 1.07 | 1.07 | 1.06 | 1.07 |
No Scalar Integer: Potential Speedup | 2.25 | 2.68 | 1.72 | 1.44 |
No Scalar Integer: Nb Loops to get 80% | 3 | 2 | 3 | 3 |
FP Vectorised: Potential Speedup | 1.26 | 1.23 | 1.36 | 1.31 |
FP Vectorised: Nb Loops to get 80% | 2 | 1 | 2 | 1 |
Fully Vectorised: Potential Speedup | 1.53 | 1.52 | 2.03 | 1.98 |
Fully Vectorised: Nb Loops to get 80% | 4 | 3 | 3 | 3 |
Only FP Arithmetic: Potential Speedup | 1.63 | 1.67 | 1.55 | 1.45 |
Only FP Arithmetic: Nb Loops to get 80% | 2 | 3 | 3 | 3 |
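
As a reading aid, the Total Time row can be expressed as a relative difference against r0. The sketch below is not part of the MAQAO output: the helper name `relative_to` is hypothetical, and the timings are copied by hand from the Total Time (s) row of the table above.

```python
# Total Time (s) per run, copied from the "Total Time (s)" row of the report.
total_time_s = {"r0": 53.34, "r1": 53.41, "r2": 55.21, "r3": 55.23}


def relative_to(baseline, times):
    """Return each run's time difference versus `baseline`, in percent.

    `relative_to` is an illustrative helper, not a MAQAO function.
    """
    base = times[baseline]
    return {run: (t - base) / base * 100.0 for run, t in times.items()}


slowdown_pct = relative_to("r0", total_time_s)
for run, pct in slowdown_pct.items():
    print(f"{run}: {total_time_s[run]:.2f} s ({pct:+.2f}% vs r0)")
```

With these figures the two acfl runs (r2, r3) come out roughly 3.5% slower than r0, while the two gcc runs (r0, r1) differ by well under 1%.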

Source Object | Issue |
---|---|
lbc | |
lb_init.F90 | -funroll-loops is missing. |
mpl_set.F90 | -funroll-loops is missing. |
tools.F90 | -funroll-loops is missing. |
lbc.F90 | -funroll-loops is missing. |
lbm_functions.F90 | -funroll-loops is missing. |

Source Object | Issue |
---|---|
lbc | |
lb_init.F90 | -funroll-loops is missing. |
mpl_set.F90 | -funroll-loops is missing. |
tools.F90 | -funroll-loops is missing. |
lbc.F90 | -funroll-loops is missing. |
lbm_functions.F90 | -funroll-loops is missing. |

Source Object | Issue |
---|---|
lbc | |
lb_init.F90 | |
mpl_set.F90 | |
tools.F90 | |
lbc.F90 | |
lbm_functions.F90 | |

Source Object | Issue |
---|---|
lbc | |
lbm_functions.F90 | |
mpl_set.F90 | |
tools.F90 | |
lbc.F90 | |
lb_init.F90 | |

Parameter | r0 | r1 | r2 | r3 |
---|---|---|---|---|
Experiment Name | | | | |
Application | ./../lbc/lbc | same as r0 | same as r0 | same as r0 |
Timestamp | 2024-11-28 15:00:52 | 2024-11-28 14:35:04 | 2024-11-28 15:21:52 | 2024-11-28 15:36:32 |
Experiment Type | Sequential | same as r0 | same as r0 | same as r0 |
Machine | turpancomp0 | turpancomp1 | same as r0 | same as r0 |
Architecture | aarch64 | same as r0 | same as r0 | same as r0 |
Micro Architecture | ARM_NEOVERSE_N1 | same as r0 | same as r0 | same as r0 |
Model Name | | | | |
Cache Size | | | | |
Number of Cores | | | | |
Maximal Frequency | 3 GHz | same as r0 | same as r0 | same as r0 |
OS Version | Linux 4.18.0-477.27.1.el8_8.aarch64 #1 SMP Thu Aug 31 11:00:23 EDT 2023 | same as r0 | same as r0 | same as r0 |
Architecture used during static analysis | aarch64 | same as r0 | same as r0 | same as r0 |
Micro Architecture used during static analysis | ARM_NEOVERSE_N1 | same as r0 | same as r0 | same as r0 |
Compilation Options | lbc: GNU Fortran2008 11.2.0 -mlittle-endian -mabi=lp64 -march=armv8.2-a+crypto+fp16+rcpc+dotprod+ssbs -g -O3 -O3 -fintrinsic-modules-path /usr/local/arm/gcc-11.2.0_Generic-AArch64_RHEL-8_aarch64-linux/bin/../lib/gcc/aarch64-linux-gnu/11.2.0/finclude -fpre-include=/usr/include/finclude/math-vector-fortran.h | lbc: GNU Fortran2008 11.2.0 -mlittle-endian -mabi=lp64 -march=armv8.2-a+crypto+fp16+rcpc+dotprod+ssbs -g -Ofast -Ofast -fintrinsic-modules-path /usr/local/arm/gcc-11.2.0_Generic-AArch64_RHEL-8_aarch64-linux/bin/../lib/gcc/aarch64-linux-gnu/11.2.0/finclude -fpre-include=/usr/include/finclude/math-vector-fortran.h | lbc: Arm F90 F90 Flang - 1.5 2017-05-01 flang -O3 -g -mcpu=native -D D3Q19 -D COMBINE_STREAM_COLLIDE -D LAYOUT_LIJK -D USE_MPI -D VERBOSE -D SENDRECV -D COMM_REDUCED -O3 -c -I /usr/local/openmpi/arm/4.1.4-cpu/include -I /usr/local/openmpi/arm/4.1.4-cpu/lib | lbc: Arm F90 F90 Flang - 1.5 2017-05-01 flang -Ofast -g -mcpu=native -D D3Q19 -D COMBINE_STREAM_COLLIDE -D LAYOUT_LIJK -D USE_MPI -D VERBOSE -D SENDRECV -D COMM_REDUCED -Ofast -c -I /usr/local/openmpi/arm/4.1.4-cpu/include -I /usr/local/openmpi/arm/4.1.4-cpu/lib |
Number of processes observed | 80 | same as r0 | same as r0 | same as r0 |
Number of threads observed | 80 | same as r0 | same as r0 | same as r0 |
Frequency Driver | cppc_cpufreq | same as r0 | same as r0 | same as r0 |
Frequency Governor | performance | same as r0 | same as r0 | same as r0 |
Huge Pages | never | same as r0 | same as r0 | same as r0 |
Hyperthreading | off | same as r0 | same as r0 | same as r0 |
Number of sockets | 1 | same as r0 | same as r0 | same as r0 |
Number of cores per socket | 80 | same as r0 | same as r0 | same as r0 |
MAQAO version | 2.20.12 | same as r0 | same as r0 | same as r0 |
MAQAO build | 62b64aff226fe590d95d07d98d47e2315a70c860::20241127-211231 | same as r0 | same as r0 | same as r0 |
Comments | | same as r0 | same as r0 | same as r0 |