Filter Information
19 threads, each covering less than 1% of profiled time (= Max (Thread Active Time)), were discarded; together they account for 0.20 seconds of CPU time. The threshold below which a thread is discarded can be adjusted with the thread-filter-threshold option.
Global Metrics
Metric                                  Value
Total Time (s)                          36.34
Max (Thread Active Time) (s)            31.72
Average Active Time (s)                 31.63
Activity Ratio (%)                      87.0
Average number of active threads        55.715
Affinity Stability                      30.7
Time in analyzed loops (%)              85.2
Time in analyzed innermost loops (%)    57.5
Time in user code (%)                   85.2
Compilation Options Score (%)           75.0
Array Access Efficiency (%)             49.7
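Several of the global metrics appear to be related by simple arithmetic. The sketch below checks two such relations against the reported values. Both formulas are assumptions inferred from the numbers, not definitions taken from the tool: that Activity Ratio is Average Active Time divided by Total Time, and that the average number of active threads is that ratio times the 64 threads listed in the Experiment Summary. The function names are hypothetical.

```python
# Values copied from the report.
TOTAL_TIME_S = 36.34
AVG_ACTIVE_TIME_S = 31.63
THREADS_OBSERVED = 64  # "Number of threads observed" in the Experiment Summary


def activity_ratio_pct(avg_active_s: float, total_s: float) -> float:
    """Assumed relation: share of wall time a thread is active, in percent."""
    return 100.0 * avg_active_s / total_s


def avg_active_threads(n_threads: int, avg_active_s: float, total_s: float) -> float:
    """Assumed relation: thread count scaled by the active-time fraction."""
    return n_threads * avg_active_s / total_s


ratio = activity_ratio_pct(AVG_ACTIVE_TIME_S, TOTAL_TIME_S)
threads = avg_active_threads(THREADS_OBSERVED, AVG_ACTIVE_TIME_S, TOTAL_TIME_S)
print(f"Activity ratio: {ratio:.1f} % (reported: 87.0)")
print(f"Average active threads: {threads:.2f} (reported: 55.715)")
```

Both computed values land close to the reported ones. The small gap in the thread count is consistent with the inputs being rounded to two decimals, but it does not prove that the tool uses these formulas.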
Potential Speedups
Metric                                                        Potential Speedup    Nb Loops to get 80%
Perfect Flow Complexity                                       1.00
Perfect OpenMP + MPI + Pthread                                1.00
Perfect OpenMP + MPI + Pthread + Perfect Load Distribution    1.18
No Scalar Integer                                             1.24                 2
FP Vectorised                                                 1.18                 1
Fully Vectorised                                              1.31                 3
FP Arithmetic Only                                            1.54                 3
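To make the speedup factors easier to read, the sketch below converts each one into a projected run time. It assumes that each factor is a whole-application speedup relative to the reported Total Time of 36.34 s. This is an interpretation, not something the report states. The factors describe separate what-if scenarios, so the sketch does not combine them. The function name `projected_time` is hypothetical.

```python
# Total Time (s) from the Global Metrics.
TOTAL_TIME_S = 36.34

# Speedup factors copied from the Potential Speedups table.
POTENTIAL_SPEEDUPS = {
    "No Scalar Integer": 1.24,
    "FP Vectorised": 1.18,
    "Fully Vectorised": 1.31,
    "FP Arithmetic Only": 1.54,
}


def projected_time(total_s: float, speedup: float) -> float:
    """Run time if the whole application ran `speedup` times faster (assumption)."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return total_s / speedup


for name, factor in POTENTIAL_SPEEDUPS.items():
    t = projected_time(TOTAL_TIME_S, factor)
    print(f"{name}: {t:.2f} s (saves {TOTAL_TIME_S - t:.2f} s)")
```

Under this reading, a factor of 1.00 means no expected gain, which is what the report shows for the first two rows of the table.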
CQA Potential Speedups Summary
Loop Based Profile
Innermost Loop Based Profile
Application Categorization
Compilation Options
Source Object           Issue
lbc
  lb_init.F90           -funroll-loops is missing.
  mpl_set.F90           -funroll-loops is missing.
  tools.F90             -funroll-loops is missing.
  lbc.F90               -funroll-loops is missing.
  lbm_functions.F90     -funroll-loops is missing.
Loop Path Count Profile
Cumulated Speedup If No Scalar Integer
Cumulated Speedup If FP Vectorized
Cumulated Speedup If Fully Vectorized
Cumulated Speedup If FP Arithmetic Only
Experiment Summary
Field                           Value
Application                     ./../lbc/lbc
Timestamp                       2024-11-27 15:48:19
Universal Timestamp             1732722499
Number of processes observed    64
Number of threads observed      64
Experiment Type                 MPI; OpenMP;
Machine                         ip-172-31-42-13
Architecture                    aarch64
Micro Architecture              ARM_NEOVERSE_V1
OS Version                      Linux 6.8.0-1016-aws #17~22.04.2-Ubuntu SMP Thu Sep 26 18:55:31 UTC 2024