Global Metrics
Total Time (s): 108.72
Max (Thread Active Time) (s): 107.19
Average Active Time (s): 106.73
Activity Ratio (%): 99.1
Average number of active threads: 51.045
Affinity Stability (%): 87.6
Time in analyzed loops (%): 86.2
Time in analyzed innermost loops (%): 85.9
Time in user code (%): 86.6
Compilation Options Score (%): 16.6
Array Access Efficiency (%): 67.0
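As a quick sanity check on the metrics above, the sketch below estimates the average number of active threads from the reported times. It assumes (this is a guess at the definition, not something the report states) that the metric is the summed per-thread active time divided by total wall time, and it takes the thread count of 52 from the Experiment Summary.

```python
# Values copied from the report.
total_time_s = 108.72
average_active_time_s = 106.73
threads_observed = 52  # from the Experiment Summary
reported_avg_active_threads = 51.045

# Assumed definition: total active thread-seconds divided by wall time.
estimated_avg_active_threads = (
    threads_observed * average_active_time_s / total_time_s
)

# Difference between the estimate and the reported value.
difference = abs(estimated_avg_active_threads - reported_avg_active_threads)

print(f"estimated: {estimated_avg_active_threads:.3f}")
print(f"reported:  {reported_avg_active_threads:.3f}")
print(f"difference: {difference:.3f}")
```

Under that assumption the estimate lands within rounding distance of the reported figure, which suggests the run kept nearly all 52 threads busy for its whole duration.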
Potential Speedups
Iterations Count: 1.00
Perfect Flow Complexity: 1.00
Perfect OpenMP/MPI/Pthread/TBB: 1.08
Perfect OpenMP/MPI/Pthread/TBB + Perfect Load Distribution: 1.16
No Scalar Integer: Potential Speedup 1.00, Nb Loops to get 80%: 2
FP Vectorised: Potential Speedup 1.33, Nb Loops to get 80%: 1
Fully Vectorised: Potential Speedup 2.12, Nb Loops to get 80%: 1
Data In L1 Cache: Potential Speedup 2.20, Nb Loops to get 80%: 1
FP Arithmetic Only: Potential Speedup 1.01, Nb Loops to get 80%: 7
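To make the speedup factors easier to read, the following sketch converts them into projected run times and the fraction of run time each scenario would remove. It assumes each factor is a ratio against the 108.72 s total time; the report does not state the reference time explicitly, so treat the numbers as rough estimates. The scenarios are evaluated independently and are not meant to be multiplied together.

```python
total_time_s = 108.72  # Total Time (s) from Global Metrics

# Potential speedup factors copied from the report.
speedups = {
    "Perfect OpenMP/MPI/Pthread/TBB": 1.08,
    "Perfect OpenMP/MPI/Pthread/TBB + Perfect Load Distribution": 1.16,
    "FP Vectorised": 1.33,
    "Fully Vectorised": 2.12,
    "Data In L1 Cache": 2.20,
    "FP Arithmetic Only": 1.01,
}


def projected_time(total_s: float, speedup: float) -> float:
    """Run time if the whole application sped up by `speedup`."""
    return total_s / speedup


def time_saved_fraction(speedup: float) -> float:
    """Fraction of the original run time removed by `speedup`."""
    return 1.0 - 1.0 / speedup


projected = {name: projected_time(total_time_s, s) for name, s in speedups.items()}
saved = {name: time_saved_fraction(s) for name, s in speedups.items()}

for name in speedups:
    print(f"{name}: {projected[name]:.2f} s ({saved[name]:.1%} saved)")
```

Read this way, the two largest levers in this run are memory locality (Data In L1 Cache) and vectorisation (Fully Vectorised), and for both a single loop accounts for 80% of the potential gain.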
[Charts not reproduced: CQA Potential Speedups Summary; Average Active Threads Count; Loop Based Profile; Innermost Loop Based Profile; Application Categorization]
Compilation Options
Source Object: Issue

llama-cli
  (unnamed source object)
    ○ -g is missing for some functions (possibly ones added by the compiler), it is needed to have more accurate reports. Other recommended flags are: -O2/-O3, -march=(target)
    ○ -O2, -O3 or -Ofast is missing.
    ○ -march=(target) is missing.

libggml-cpu.so
  sgemm.cpp
    ○ -g is missing for some functions (possibly ones added by the compiler), but debug locations are available. Some analysis may be inaccurate.
    ○ -O2, -O3 or -Ofast is missing.
    ○ -x(target) or -ax(target) is missing.
  quants.c
    ○ -g is missing for some functions (possibly ones added by the compiler), but debug locations are available. Some analysis may be inaccurate.
    ○ -O2, -O3 or -Ofast is missing.
    ○ -x(target) or -ax(target) is missing.
  vec.cpp
    ○ -g is missing for some functions (possibly ones added by the compiler), but debug locations are available. Some analysis may be inaccurate.
    ○ -O2, -O3 or -Ofast is missing.
    ○ -x(target) or -ax(target) is missing.
  ggml-cpu.c
    ○ -g is missing for some functions (possibly ones added by the compiler), but debug locations are available. Some analysis may be inaccurate.
    ○ -O2, -O3 or -Ofast is missing.
    ○ -x(target) or -ax(target) is missing.
  ops.cpp
    ○ -g is missing for some functions (possibly ones added by the compiler), but debug locations are available. Some analysis may be inaccurate.
    ○ -O2, -O3 or -Ofast is missing.
    ○ -x(target) or -ax(target) is missing.
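The three issue types listed above (debug info, optimization level, target architecture) can be checked on a compile command before re-running the analysis. The sketch below is a simple heuristic, not MAQAO's own detection logic; the function name `missing_flags` and the sample command lines are hypothetical, and the patterns only cover the flag spellings named in the report.

```python
import re
import shlex


def missing_flags(command: str) -> list[str]:
    """Return the report's issue categories that `command` does not address.

    Heuristic only: it looks for -g (optionally with a level), one of
    -O2/-O3/-Ofast, and a target flag of the form -march=..., -x<TARGET>
    or -ax<TARGET>.
    """
    args = shlex.split(command)
    issues = []
    if not any(re.fullmatch(r"-g\d?", a) for a in args):
        issues.append("-g is missing")
    if not any(a in {"-O2", "-O3", "-Ofast"} for a in args):
        issues.append("-O2, -O3 or -Ofast is missing")
    # Require an uppercase letter after -x/-ax so that language selectors
    # such as "-x c++" are not mistaken for a target flag.
    if not any(re.fullmatch(r"-march=.+|-a?x[A-Z][\w,-]*", a) for a in args):
        issues.append("target flag (-march=, -x or -ax) is missing")
    return issues


# Hypothetical command lines, for illustration only.
bare = missing_flags("icx -c ops.c -o ops.o")
tuned = missing_flags("icx -O3 -g -xHost -c ops.c -o ops.o")

print(bare)
print(tuned)
```

In a CMake-based build the same check could be applied to the entries of `compile_commands.json`, assuming that file is generated for the build tree.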
[Charts not reproduced: Loop Iteration Count Profile; Loop Path Count Profile; Cumulated Speedup If No Scalar Integer; Cumulated Speedup If FP Vectorized; Cumulated Speedup If Fully Vectorized; Cumulated Speedup If Data In L1; Cumulated Speedup If FP Arithmetic Only]
Experiment Summary
Application: ../build_icx_no_rpath/bin/llama-cli
Timestamp: 2025-10-20 13:30:22
Universal Timestamp: 1760959822
Number of processes observed: 1
Number of threads observed: 52
Experiment Type: OpenMP;
Machine: skylake
Model Name: Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz
Architecture: x86_64
Micro Architecture: SKYLAKE
Cache Size: 36608 KB
Number of Cores: 26
OS Version: Linux 6.17.2-arch1-1 #1 SMP PREEMPT_DYNAMIC Sun, 12 Oct 2025 12:45:18 +0000
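Two details of the experiment summary can be cross-checked with a few lines of Python: how the Universal Timestamp relates to the local Timestamp, and how many threads ran per core. The sketch assumes the Universal Timestamp is a Unix epoch value in seconds, which the report does not state outright.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the report.
universal_timestamp = 1760959822
reported_local = datetime(2025, 10, 20, 13, 30, 22)
threads_observed = 52
number_of_cores = 26

# Interpret the universal timestamp as seconds since the Unix epoch (assumption).
utc_time = datetime.fromtimestamp(universal_timestamp, tz=timezone.utc)

# Offset between the local timestamp in the report and UTC.
utc_offset = reported_local - utc_time.replace(tzinfo=None)

# Threads per physical core observed during the run.
threads_per_core = threads_observed / number_of_cores

print("UTC time:", utc_time.isoformat())
print("Local offset from UTC:", utc_offset)
print("Threads per core:", threads_per_core)
```

The local timestamp is two hours ahead of UTC, and the run used two threads per core, so every hardware thread of the 26-core CPU was in use.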