Total Time (s) | 50.31 | |
Profiled Time (s) | 46.33 | |
Time in analyzed loops (%) | 89.0 | |
Time in analyzed innermost loops (%) | 79.0 | |
Time in user code (%) | 91.3 | |
Compilation Options Score (%) | 100 | |
Perfect Flow Complexity | 1.02 | |
Array Access Efficiency (%) | Not Available | |
GFLOPS | 0.0 | |
Perfect OpenMP + MPI + Pthread | 1.03 | |
Perfect OpenMP + MPI + Pthread + Perfect Load Distribution | 1.15 | |
No Scalar Integer | Potential Speedup | 1.18 |
No Scalar Integer | Nb Loops to get 80% | 17 |
FP Vectorised | Potential Speedup | 1.59 |
FP Vectorised | Nb Loops to get 80% | 20 |
Fully Vectorised | Potential Speedup | 1.67 |
Fully Vectorised | Nb Loops to get 80% | 23 |
FP Arithmetic Only | Potential Speedup | 1.76 |
FP Arithmetic Only | Nb Loops to get 80% | 23 |
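To make the potential speedup figures easier to read, the following minimal sketch converts each one into a projected run time. It assumes that a speedup is the ratio of baseline time to optimized time and that the baseline is the profiled time of 46.33 s; the report does not state which baseline the tool uses, so treat the results as rough estimates. The names PROFILED_TIME_S, POTENTIAL_SPEEDUPS and projected_time are illustrative and are not part of MAQAO.

```python
# Values copied from the metrics table above.
PROFILED_TIME_S = 46.33
POTENTIAL_SPEEDUPS = {
    "No Scalar Integer": 1.18,
    "FP Vectorised": 1.59,
    "Fully Vectorised": 1.67,
    "FP Arithmetic Only": 1.76,
}


def projected_time(baseline_s: float, speedup: float) -> float:
    """Return the projected time, assuming speedup = baseline / optimized."""
    return baseline_s / speedup


# Assumption: the profiled time is the baseline the speedups refer to.
projected = {
    name: projected_time(PROFILED_TIME_S, speedup)
    for name, speedup in POTENTIAL_SPEEDUPS.items()
}

for name, seconds in projected.items():
    saved = PROFILED_TIME_S - seconds
    print(f"{name}: about {seconds:.2f} s (about {saved:.2f} s saved)")
```

Under these assumptions, even the most optimistic scenario (FP Arithmetic Only) leaves more than half of the profiled time in place.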
Source Object | Issue |
libgromacs_mpi.so.7 | |
○fft5d.cpp | |
○threaded_force_buffer.cpp | |
○stl_vector.h | |
○pme_gather.cpp | |
○listed_forces.cpp | |
○partition.cpp | |
○manage_threading.cpp | |
○kernel_prune.cpp | |
○pairs.cpp | |
○vec.h | |
○update.cpp | |
○md_support.cpp | |
○pme.cpp | |
○kernel_common.cpp | |
○mdatoms.cpp | |
○lincs.cpp | |
○pbc.cpp | |
○constr.cpp | |
○pme_grid.cpp | |
○localtopology.cpp | |
○kerneldispatch.cpp | |
○pme_solve.cpp | |
○pme_spread.cpp | |
○calc_verletbuf.cpp | |
○fft_fftw3.cpp | |
○bonded.cpp | |
○pairlist.cpp | |
○sim_util.cpp | |
○grid.cpp | |
○settle.cpp | |
○kernel_outer.h | |
○domdec_constraints.cpp | |
○redistribute.cpp | |
○impl_arm_sve_util_float.h | |
○atomdata.cpp | |
Application | /home/eoseret/GROMACS/build/gcc_2/bin/gmx_mpi | | |
Timestamp | 2023-08-08 09:18:53 |
Universal Timestamp | 1691486333 |
Number of processes observed | 1 |
Number of threads observed | 52 |
Experiment Type | MPI; OpenMP; | | |
Machine | ip-172-31-47-199 | | |
Architecture | aarch64 |
Micro Architecture | ARM_NEOVERSE_V1 |
OS Version | Linux 5.15.0-1039-aws #44~20.04.1-Ubuntu SMP Thu Jun 22 12:21:08 UTC 2023 | | |
Architecture used during static analysis | aarch64 |
Micro Architecture used during static analysis | ARM_NEOVERSE_V1 |
Frequency Driver | NA |
Frequency Governor | NA |
Huge Pages | madvise |
Hyperthreading | off |
Number of sockets | 1 |
Number of cores per socket | 64 |
Compilation Options | libgromacs_mpi.so.7: GNU C++17 11.1.0 -march=armv8.2-a+sve -msve-vector-bits=256 -mlittle-endian -mabi=lp64 -g -O3 -O3 -std=c++17 -fno-omit-frame-pointer -fcf-protection=none -fPIC -fexcess-precision=fast -funroll-all-loops -fopenmp -fasynchronous-unwind-tables -fstack-protector-strong -fstack-clash-protection | | |
Comments | GNU g++ 12.2.0 (SIMD=SVE), AWS G3 (Neoverse V1), 10000 steps, 52 cores | | |
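The Timestamp and Universal Timestamp rows describe the same instant. As a quick check, this short sketch converts the Unix epoch value to a calendar date in UTC using only the Python standard library. The variable names are illustrative.

```python
from datetime import datetime, timezone

# Value copied from the "Universal Timestamp" row.
UNIVERSAL_TIMESTAMP = 1691486333

# Interpret the value as seconds since the Unix epoch, rendered in UTC.
utc_time = datetime.fromtimestamp(UNIVERSAL_TIMESTAMP, tz=timezone.utc)
formatted = utc_time.strftime("%Y-%m-%d %H:%M:%S")

print(formatted)
```

The UTC rendering matches the Timestamp row (2023-08-08 09:18:53), which suggests, without confirming, that the report records its timestamp in UTC.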
Dataset | |
Run Command | <executable> mdrun -s ion_channel.tpr -nsteps 10000 -pin on -deffnm gcc |
MPI Command | mpirun -n <number_processes> --bind-to core |
Number of Processes | 1 |
Number of Nodes | 1 |
Number of Processes per Node | 1 |
Filter | Not Used |
Profile Start | Not Used |
Maximal Path Number | 4 |
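The Run Command and MPI Command rows contain the placeholders <executable> and <number_processes>. The sketch below shows one plausible way to expand them into a full command line, using the application path and process count reported above. It assumes the MPI command is simply prefixed to the run command; the report does not spell out how the tool combines them, and build_command is a hypothetical helper, not a MAQAO function.

```python
import shlex

# Templates and values copied from the report tables.
MPI_TEMPLATE = "mpirun -n <number_processes> --bind-to core"
RUN_TEMPLATE = "<executable> mdrun -s ion_channel.tpr -nsteps 10000 -pin on -deffnm gcc"
EXECUTABLE = "/home/eoseret/GROMACS/build/gcc_2/bin/gmx_mpi"
NUMBER_PROCESSES = 1


def build_command(mpi_template: str, run_template: str,
                  executable: str, n_procs: int) -> list[str]:
    """Expand both placeholders and return an argv-style list.

    Assumption: the MPI launcher command is placed before the run command.
    """
    mpi_part = mpi_template.replace("<number_processes>", str(n_procs))
    run_part = run_template.replace("<executable>", shlex.quote(executable))
    return shlex.split(f"{mpi_part} {run_part}")


argv = build_command(MPI_TEMPLATE, RUN_TEMPLATE, EXECUTABLE, NUMBER_PROCESSES)

# Print a shell-safe rendering of the command without running it.
print(shlex.join(argv))
```

Returning a list rather than a single string keeps the result suitable for passing to subprocess.run without invoking a shell.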