Metric | Sub-metric | r0 | r1 | r2 | r3 | r4 | r5 | r6 |
---|---|---|---|---|---|---|---|---|
Total Time (s) | | 1.87E3 | 968.05 | 500.34 | 265.25 | 143.52 | 85.11 | 48.71 |
Profiled Time (s) | | 1.87E3 | 960.82 | 495.16 | 261.06 | 139.91 | 83.52 | 47.35 |
Time in analyzed loops (%) | | 94.1 | 93.1 | 91.8 | 90.4 | 85.4 | 74.0 | 69.0 |
Time in analyzed innermost loops (%) | | 86.4 | 84.9 | 83.5 | 81.6 | 76.9 | 66.2 | 61.0 |
Time in user code (%) | | 0 | 0.86 | 1.95 | 2.94 | 7.06 | 19.5 | 23.5 |
Compilation Options Score (%) | | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
Perfect Flow Complexity | | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
Array Access Efficiency (%) | | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available | Not Available |
Perfect OpenMP + MPI + Pthread | | 1.00 | 1.00 | 1.00 | 1.00 | 1.01 | 1.00 | 1.01 |
Perfect OpenMP + MPI + Pthread + Perfect Load Distribution | | 1.00 | 1.00 | 1.00 | 1.00 | 1.01 | 1.01 | 1.03 |
No Scalar Integer | Potential Speedup | 1.05 | 1.05 | 1.05 | 1.05 | 1.05 | 1.04 | 1.04 |
No Scalar Integer | Nb Loops to get 80% | 9 | 10 | 11 | 12 | 12 | 12 | 12 |
FP Vectorised | Potential Speedup | 1.02 | 1.02 | 1.02 | 1.02 | 1.01 | 1.01 | 1.01 |
FP Vectorised | Nb Loops to get 80% | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
Fully Vectorised | Potential Speedup | 1.05 | 1.06 | 1.06 | 1.06 | 1.06 | 1.05 | 1.05 |
Fully Vectorised | Nb Loops to get 80% | 15 | 19 | 20 | 21 | 20 | 21 | 21 |
Only FP Arithmetic | Potential Speedup | 1.07 | 1.08 | 1.08 | 1.09 | 1.08 | 1.07 | 1.07 |
Only FP Arithmetic | Nb Loops to get 80% | 14 | 19 | 19 | 20 | 19 | 20 | 19 |
Scalability - Gap | | 1.00 | 1.04 | 1.07 | 1.14 | 1.23 | 1.46 | 1.67 |
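
The scaling behaviour behind these figures can be recomputed from the Total Time row and the process counts of runs r0 to r6 (1, 2, 4, 8, 16, 32, 64). The sketch below is illustrative only: the function name `scaling_summary` is hypothetical, and the "gap" formula (measured time divided by the ideal time under perfect strong scaling from r0) is an assumption that appears consistent with the reported Scalability - Gap values, not a definition taken from MAQAO. Note that the r0 time is only reported as 1.87E3, so results carry that rounding.

```python
# Values copied from the report; r0's total time is only given as 1.87E3.
total_time_s = [1.87e3, 968.05, 500.34, 265.25, 143.52, 85.11, 48.71]
processes = [1, 2, 4, 8, 16, 32, 64]
run_names = ["r0", "r1", "r2", "r3", "r4", "r5", "r6"]


def scaling_summary(times, procs):
    """Return one (speedup, efficiency, gap) tuple per run, relative to the first run.

    Assumed definitions (not taken from MAQAO documentation):
      speedup    = T_base / T_run
      efficiency = speedup / (procs_run / procs_base)
      gap        = 1 / efficiency, i.e. measured time / ideal time
    """
    base_time, base_procs = times[0], procs[0]
    rows = []
    for t, p in zip(times, procs):
        scale = p / base_procs          # how many times more processes than the base run
        speedup = base_time / t
        efficiency = speedup / scale
        gap = 1.0 / efficiency
        rows.append((speedup, efficiency, gap))
    return rows


for name, (s, e, g) in zip(run_names, scaling_summary(total_time_s, processes)):
    print(f"{name}: speedup {s:.2f}, efficiency {e:.1%}, gap {g:.2f}")
```

Under these assumptions the computed gap agrees with the Scalability - Gap row to within the rounding of the r0 time, which suggests that at 64 processes the run takes about 1.67 times longer than perfect scaling from r0 would predict.
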

Source Object | Issue |
---|---|
libgromacs_mpi.so.7.0.0 | |
- lincs.cpp | |
- pbc.cpp | |
- domdec.cpp | |
- pme_redistribute.cpp | |
- fft5d.cpp | |
- impl_arm_sve_util_float.h | |
- calc_verletbuf.cpp | |
- threaded_force_buffer.cpp | |
- update.cpp | |
- pme_pp.cpp | |
- localtopology.cpp | |
- settle.cpp | |
- pme_solve.cpp | |
- pme_spread.cpp | |
- atomdata.cpp | |
- manage_threading.cpp | |
- kernel_prune.cpp | |
- pme_grid.cpp | |
- partition.cpp | |
- kernel_outer.h | |
- pairs.cpp | |
- pairlist.cpp | |
- sim_util.cpp | |
- grid.cpp | |
- md_support.cpp | |
- bonded.cpp | |
- domdec_constraints.cpp | |
- vec.h | |
- mdatoms.cpp | |
- pme_gather.cpp | |
gmx_mpi | -g is missing; it is needed for more accurate reports. Other recommended flags are -O2/-O3 and -march=(target). |

Parameter | r0 | r1 | r2 | r3 | r4 | r5 | r6 |
---|---|---|---|---|---|---|---|
Application | /home/eoseret/GROMACS/build/gcc_2/bin/gmx_mpi | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Timestamp | 2023-02-21 17:49:53 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Experiment Type | MPI; | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Machine | ip-172-31-8-114 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Architecture | arm64 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Micro Architecture | ARM_NEOVERSE_V1 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Model Name | | | | | | | |
Cache Size | | | | | | | |
Number of Cores | | | | | | | |
Maximal Frequency | 0 GHz | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
OS Version | Linux 5.15.0-1030-aws #34~20.04.1-Ubuntu SMP Tue Jan 24 15:16:39 UTC 2023 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Architecture used during static analysis | arm64 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Micro Architecture used during static analysis | ARM_NEOVERSE_V1 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Compilation Options | libgromacs_mpi.so.7.0.0: GNU C++17 11.1.0 -march=armv8.2-a+sve -msve-vector-bits=256 -mlittle-endian -mabi=lp64 -g -O3 -O3 -std=c++17 -fno-omit-frame-pointer -fcf-protection=none -fPIC -fexcess-precision=fast -funroll-all-loops -fopenmp -fasynchronous-unwind-tables -fstack-protector-strong -fstack-clash-protection | same as r0 | same as r0 | same as r0 | same as r0 | libgromacs_mpi.so.7.0.0: GNU C++17 11.1.0 -march=armv8.2-a+sve -msve-vector-bits=256 -mlittle-endian -mabi=lp64 -g -O3 -O3 -std=c++17 -fno-omit-frame-pointer -fcf-protection=none -fPIC -fexcess-precision=fast -funroll-all-loops -fopenmp -fasynchronous-unwind-tables -fstack-protector-strong -fstack-clash-protection gmx_mpi: N/A | same as r0 |
Number of processes observed | 1 | 2 | 4 | 8 | 16 | 32 | 64 |
Number of threads observed | 1 | 2 | 4 | 8 | 16 | 32 | 64 |
MAQAO version | 2.16.3 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
MAQAO build | Build information not available | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |
Comments | GNU 11.1 (SIMD=SVE), AWS G3 (Neoverse V1), scalability | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 | same as r0 |