Loops
mmq.cpp: 1597 - 173.14 %
| Run orig_default | Run gcc_default | Run aocc_default | Run icx_10 | Run gcc_4 | Run aocc_6 | ||||||||||||||||||||||||||||||||||||
| Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions | ||||||||||||||||||||||||||||||||||||
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | 405 | 24.65 | 24.14 | 86.35 | 87.67 | 85.7 | 89.25 | | | | | | | | | | | | | | | 400 | 24.73 | 24.14 | 86.79 | 87.67 | 85.7 | 89.52 | | | | | | | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (libggml-cpu.so - 405) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (libggml-cpu.so - 400) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | ||||||||||||||||||||||||||||||||||||
| Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | ||||||||||||||||||||||||||||||
| | Data Access Issues | | | Data Access Issues | |
| | | Presence of constant non-unit stride data access | 1 | | | | | Presence of constant non-unit stride data access | 1 | | |
| | | Presence of indirect access | 1 | | | | | Presence of indirect access | 1 | | |
| | | Presence of special instructions executing on a single port | 1 | | | | | Presence of special instructions executing on a single port | 1 | | |
| | Vectorization Roadblocks | | | Vectorization Roadblocks | |
| | | Presence of constant non-unit stride data access | 1 | | | | | Presence of constant non-unit stride data access | 1 | | |
| | | Presence of indirect access | 1 | | | | | Presence of indirect access | 1 | | |
| | Inefficient Vectorization | | | Inefficient Vectorization | |
| | | Presence of special instructions executing on a single port | 1 | | | | | Presence of special instructions executing on a single port | 1 | | |
mmq.cpp: 1570 - 170.17 %
| Run orig_default | Run gcc_default | Run aocc_default | Run icx_10 | Run gcc_4 | Run aocc_6 | ||||||||||||||||||||||||||||||||||||
| Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions | ||||||||||||||||||||||||||||||||||||
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | | | | | | | | 563 | 24.50 | 23.97 | 84.92 | 0 | 0 | 89.54 | | | | | | | | | | | | | | | 511 | 24.58 | 23.92 | 85.25 | 0 | 0 | 90.49 |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (libggml-cpu.so - 563) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (libggml-cpu.so - 511) | ||||||||||||||||||||||||||||||||||||
| Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | ||||||||||||||||||||||||||||||
mmq.cpp: 1573 - 169.87 %
| Run orig_default | Run gcc_default | Run aocc_default | Run icx_10 | Run gcc_4 | Run aocc_6 | ||||||||||||||||||||||||||||||||||||
| Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions | ||||||||||||||||||||||||||||||||||||
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 452 | 24.60 | 24.02 | 84.98 | 0 | 0 | 89.29 | | | | | | | | | | | | | | | 322 | 24.63 | 24.01 | 84.90 | 0 | 0 | 89.34 | | | | | | | | | | | | | | |
| Sum on 1 analyzed binary loop (libggml-cpu.so - 452) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (libggml-cpu.so - 322) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | ||||||||||||||||||||||||||||||||||||
| Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | ||||||||||||||||||||||||||||||
<unknown>: 0 - 6.31 %
| Run orig_default | Run gcc_default | Run aocc_default | Run icx_10 | Run gcc_4 | Run aocc_6 | ||||||||||||||||||||||||||||||||||||
| Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions | ||||||||||||||||||||||||||||||||||||
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1900 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1407 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 1138 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2016 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1610 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2360 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 2328 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 3040 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2873 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2316 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 3089 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2254 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 2321 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 3036 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2870 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 996 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 3102 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2605 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 |
| 2552 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2738 | 0.08 | 0.00 | 0.01 | 0 | 0 | 0 | 1097 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 285 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2802 | 0.08 | 0.00 | 0.01 | 0 | 0 | 0 | 2268 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 |
| 1939 | 0.00 | 0.00 | 0.00 | 0 | 0 | NA | 1409 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1256 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1062 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 1802 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2316 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 1021 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 3463 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 978 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1702 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 3558 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 1101 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 2334 | 0.02 | 0.00 | 0.00 | 0 | 0 | 0 | 551 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2749 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2111 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 3029 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2503 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 |
| 2146 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 3208 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2486 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2114 | 0.02 | 0.00 | 0.00 | 0 | 0 | 0 | 3257 | 0.02 | 0.00 | 0.00 | 0 | 0 | 0 | 2489 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 |
| 2448 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 3178 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2733 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 1061 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2734 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 926 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 2167 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 1369 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2461 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 926 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 3098 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1952 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 2437 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2964 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 1033 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1953 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2144 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1970 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 |
| 2176 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2675 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2871 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1962 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1468 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 1136 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 1195 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2938 | 0.02 | 0.00 | 0.00 | 0 | 0 | 0 | 2478 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1974 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 3232 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 896 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 2438 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2936 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2492 | 0.01 | 0.00 | 0.00 | 0 | 0 | 2.7 | 1955 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2803 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2499 | 0.02 | 0.00 | 0.00 | 0 | 0 | 0 |
| 2322 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 1549 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2891 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1981 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1844 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2262 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 2435 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 3149 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2215 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 1719 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 3003 | 0.02 | 0.00 | 0.00 | 0 | 0 | 0 | 1054 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 1901 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 3153 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 1140 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 1715 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 3448 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 972 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 2155 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2939 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2896 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1027 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 3289 | 0.02 | 0.00 | 0.00 | 0 | 0 | 0 | 2602 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 2174 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 3357 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2744 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2201 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 3004 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 445 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 484 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2674 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2275 | 0.06 | 0.00 | 0.01 | 0 | 0 | 0 | 2110 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 3095 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2236 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 504 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1422 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 530 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2204 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 3228 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2042 | 0.05 | 0.00 | 0.00 | 0 | 0 | 0 |
| 483 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 727 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 459 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 1983 | 0.02 | 0.00 | 0.00 | 0 | 0 | 0 | 3006 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 390 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 |
| 1868 | 0.05 | 0.00 | 0.00 | 0 | 0 | 0 | 3228 | 0.15 | 0.00 | 0.01 | 0 | 0 | 0 | 877 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2116 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 752 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 391 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 2004 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 31 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 325 | 0.00 | 0.00 | 0.00 | 0 | 0 | 56.8 | 1647 | 0.07 | 0.00 | 0.01 | 0 | 0 | 0 | 2377 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 462 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 2003 | 0.01 | 0.00 | 0.00 | 0 | 0 | 131.92 | 471 | 0.01 | 0.00 | 0.00 | 0 | 0 | 150.61 | 991 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 443 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 3381 | 0.12 | 0.00 | 0.01 | 0 | 0 | 0 | 763 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 452 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 706 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 734 | 0.00 | 0.00 | 0.00 | 0 | 0 | 604.97 | 413 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 651 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 333 | 0.01 | 0.00 | 0.00 | 0 | 0 | 182.13 |
| 804 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 510 | 0.03 | 0.01 | 0.03 | 0 | 0 | 1507.18 | 416 | 0.14 | 0.10 | 0.37 | 0 | 0 | 0 | 463 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 677 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 816 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 700 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 636 | 0.09 | 0.00 | 0.01 | 0 | 0 | 520.25 | 1790 | 0.15 | 0.00 | 0.01 | 0 | 0 | 410.49 | 1738 | 0.03 | 0.00 | 0.00 | 0 | 0 | 182.88 | 441 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 83 | 0.00 | 0.00 | 0.00 | 0 | 0 | NA |
| 365 | 0.01 | 0.00 | 0.00 | 0 | 0 | 75.34 | 1348 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 1264 | 0.24 | 0.08 | 0.27 | 0 | 0 | 1885.36 | 633 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 218 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 1747 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 161 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 312 | 0.02 | 0.00 | 0.01 | 0 | 0 | 0 | 2137 | 0.02 | 0.00 | 0.01 | 0 | 0 | 600.88 | 734 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 221 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 387 | 0.00 | 0.00 | 0.00 | 0 | 0 | NA |
| 639 | 0.17 | 0.10 | 0.37 | 0 | 0 | 0 | 313 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 107 | 0.00 | 0.00 | 0.00 | 0 | 0 | NA | 316 | 0.01 | 0.00 | 0.00 | 0 | 0 | 188.54 | 66 | 0.00 | 0.00 | 0.00 | 0 | 0 | NA | 2029 | 0.02 | 0.01 | 0.02 | 0 | 0 | 0.1 |
| 122 | 0.03 | 0.01 | 0.03 | 0 | 0 | 19.23 | 52 | 0.04 | 0.01 | 0.05 | 0 | 0 | 10.36 | 727 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 19 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 535 | 0.01 | 0.00 | 0.00 | 0 | 0 | 904.33 | 1279 | 0.25 | 0.08 | 0.29 | 0 | 0 | 1768.76 |
| 813 | 0.20 | 0.00 | 0.01 | 0 | 0 | 206.73 | 459 | 0.12 | 0.07 | 0.25 | 0 | 0 | 0 | 739 | 0.01 | 0.00 | 0.00 | 0 | 0 | 286.73 | 696 | 0.17 | 0.00 | 0.02 | 0 | 0 | 190.48 | 615 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 689 | 0.01 | 0.00 | 0.00 | 0 | 0 | 4060.6 |
| 2176 | 0.01 | 0.00 | 0.00 | 0 | 0 | 301.62 | 1684 | 0.00 | 0.00 | 0.00 | 0 | 0 | NA | 737 | 0.01 | 0.00 | 0.00 | 0 | 0 | 245.94 | 1089 | 0.07 | 0.03 | 0.12 | 0 | 0 | 1429.5 | 530 | 0.01 | 0.00 | 0.00 | 0 | 0 | 5235.02 | 358 | 0.14 | 0.11 | 0.38 | 0 | 0 | 0 |
| 116 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 70 | 0.00 | 0.00 | 0.00 | 0 | 0 | NA | 108 | 0.04 | 0.01 | 0.03 | 0 | 0 | 43.77 | 1096 | 0.16 | 0.00 | 0.01 | 0 | 0 | 8015.27 | 532 | 0.01 | 0.00 | 0.00 | 0 | 0 | 552.3 | 2047 | 0.14 | 0.07 | 0.24 | 0 | 0 | 880.52 |
| 451 | 0.01 | 0.00 | 0.00 | 0 | 0 | 385.3 | 1343 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 741 | 0.01 | 0.00 | 0.00 | 0 | 0 | 4142.9 | 1518 | 0.17 | 0.00 | 0.01 | 0 | 0 | 410.51 | 1307 | 0.07 | 0.00 | 0.00 | 0 | 0 | 1259.28 | 687 | 0.01 | 0.00 | 0.00 | 0 | 0 | 361.27 |
| 1573 | 0.01 | 0.00 | 0.00 | 0 | 0 | 7691.55 | 514 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 410 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1512 | 0.00 | 0.00 | 0.00 | 0 | 0 | 301.43 | 537 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 1647 | 0.03 | 0.00 | 0.00 | 0 | 0 | 2609.14 |
| 2191 | 0.00 | 0.00 | 0.00 | 0 | 0 | 496.41 | 67 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2473 | 0.00 | 0.00 | 0.00 | 0 | 0 | 151.44 | 321 | 0.01 | 0.00 | 0.00 | 0 | 0 | 276.44 | 536 | 0.00 | 0.00 | 0.00 | 0 | 0 | 8433.28 | 683 | 0.00 | 0.00 | 0.00 | 0 | 0 | NA |
| 6 | 0.02 | 0.00 | 0.01 | 0 | 0 | 0 | 980 | 0.02 | 0.00 | 0.02 | 0 | 0 | 220.27 | 848 | 0.09 | 0.00 | 0.01 | 0 | 0 | 432.01 | 2080 | 0.00 | 0.00 | 0.00 | 0 | 0 | NA | 63 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 2051 | 0.00 | 0.00 | 0.00 | 0 | 0 | NA |
| 1812 | 0.02 | 0.00 | 0.02 | 0 | 0 | 303.17 | 985 | 0.03 | 0.01 | 0.03 | 0 | 0 | 669.16 | 5 | 0.01 | 0.00 | 0.01 | 0 | 0 | 0 | 52 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1627 | 0.16 | 0.09 | 0.31 | 0 | 0 | 1364.7 | 352 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 |
| 547 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1593 | 0.01 | 0.00 | 0.00 | 0 | 0 | 204.39 | 2469 | 0.14 | 0.07 | 0.24 | 0 | 0 | 884.41 | 541 | 0.14 | 0.11 | 0.38 | 0 | 0 | 0 | 729 | 0.15 | 0.00 | 0.01 | 0 | 0 | 229.44 | 681 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 1837 | 0.03 | 0.01 | 0.03 | 0 | 0 | 305.41 | 1599 | 0.15 | 0.08 | 0.28 | 0 | 0 | 926.47 | 2145 | 0.03 | 0.01 | 0.04 | 0 | 0 | 415.06 | 6 | 0.01 | 0.00 | 0.01 | 0 | 0 | 0 | 1367 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 2762 | 0.24 | 0.01 | 0.02 | 0 | 0 | 1218.17 |
| 405 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 1283 | 0.00 | 0.00 | 0.00 | 0 | 0 | 11749.07 | 2451 | 0.03 | 0.01 | 0.02 | 0 | 0 | 10.72 | 282 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 48 | 0.04 | 0.01 | 0.05 | 0 | 0 | 13.18 | 1289 | 0.13 | 0.00 | 0.01 | 0 | 0 | 7659.04 |
| 553 | 0.02 | 0.00 | 0.00 | 0 | 0 | 287.87 | 513 | 0.01 | 0.00 | 0.00 | 0 | 0 | 3916.64 | 2442 | 0.01 | 0.00 | 0.00 | 0 | 0 | 423.96 | 1956 | 0.02 | 0.01 | 0.02 | 0 | 0 | 2.51 | 1306 | 0.01 | 0.00 | 0.00 | 0 | 0 | 5955.1 | 2024 | 0.00 | 0.00 | 0.00 | 0 | 0 | 285.65 |
| 555 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | 952 | 0.14 | 0.00 | 0.01 | 0 | 0 | 7814.8 | 1791 | 0.01 | 0.00 | 0.00 | 0 | 0 | 5480.17 | 1952 | 0.01 | 0.00 | 0.00 | 0 | 0 | 337.63 | 3 | 0.02 | 0.00 | 0.01 | 0 | 0 | 0 | 84 | 0.03 | 0.01 | 0.02 | 0 | 0 | 23.24 |
| 557 | 0.01 | 0.00 | 0.01 | 0 | 0 | 93.6 | 1284 | 0.20 | 0.00 | 0.02 | 0 | 0 | 364.06 | 560 | 0.00 | 0.00 | 0.00 | 0 | 0 | 778.49 | 1964 | 0.21 | 0.11 | 0.39 | 0 | 0 | 1404.31 | 1004 | 0.02 | 0.00 | 0.02 | 0 | 0 | 188.18 | 1836 | 0.02 | 0.00 | 0.02 | 0 | 0 | 241.36 |
| 2180 | 0.02 | 0.01 | 0.02 | 0 | 0 | 1.94 | 1596 | 0.03 | 0.01 | 0.02 | 0 | 0 | 0 | 736 | 0.01 | 0.00 | 0.00 | 0 | 0 | 86.5 | 57 | 0.00 | 0.00 | 0.00 | 0 | 0 | 50.27 | 655 | 0.05 | 0.00 | 0.01 | 0 | 0 | 614.8 | 76 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 |
| 2189 | 0.19 | 0.11 | 0.39 | 0 | 0 | 1293.28 | 3 | 0.01 | 0.00 | 0.01 | 0 | 0 | 0 | 738 | 0.00 | 0.00 | 0.00 | 0 | 0 | 454.13 | 281 | 0.02 | 0.00 | 0.01 | 0 | 0 | 4.43 | 589 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | 1646 | 0.01 | 0.00 | 0.00 | 0 | 0 | 5025.11 |
| 548 | 0.02 | 0.00 | 0.00 | 0 | 0 | 2459.7 | 948 | 0.15 | 0.08 | 0.27 | 0 | 0 | 2127.28 | 1268 | 0.13 | 0.00 | 0.01 | 0 | 0 | 9120.76 | 1725 | 0.04 | 0.01 | 0.03 | 0 | 0 | 298.84 | 1624 | 0.02 | 0.01 | 0.02 | 0 | 0 | 1.38 | 814 | 0.08 | 0.00 | 0.01 | 0 | 0 | 334.6 |
| 1578 | 0.18 | 0.00 | 0.01 | 0 | 0 | 381.47 | 509 | 0.01 | 0.00 | 0.01 | 0 | 0 | 374.37 | 946 | 0.16 | 0.00 | 0.02 | 0 | 0 | 185.65 | 58 | 0.04 | 0.01 | 0.03 | 0 | 0 | 7.82 | 974 | 0.13 | 0.00 | 0.01 | 0 | 0 | 8366.06 | 685 | 0.00 | 0.00 | 0.00 | 0 | 0 | 474.25 |
| 1120 | 0.09 | 0.03 | 0.12 | 0 | 0 | 1553.25 | 1601 | 0.01 | 0.00 | 0.00 | 0 | 0 | 161.92 | 2632 | 0.00 | 0.00 | 0.00 | 0 | 0 | 302.48 | 1967 | 0.01 | 0.00 | 0.00 | 0 | 0 | 1017.95 | 972 | 0.08 | 0.04 | 0.15 | 0 | 0 | 1555.78 | 1840 | 0.04 | 0.01 | 0.05 | 0 | 0 | 332.76 |
| 1127 | 0.13 | 0.00 | 0.01 | 0 | 0 | 7959.93 | 710 | 0.13 | 0.00 | 0.01 | 0 | 0 | 199.93 | 409 | 0.01 | 0.00 | 0.01 | 0 | 0 | 4.8 | 562 | 0.13 | 0.00 | 0.01 | 0 | 0 | 307.44 | 1629 | 0.00 | 0.00 | 0.00 | 0 | 0 | 542.87 | 684 | 0.01 | 0.00 | 0.00 | 0 | 0 | 227.6 |
| 404 | 0.01 | 0.00 | 0.01 | 0 | 0 | 0 | 1516 | 0.01 | 0.00 | 0.00 | 0 | 0 | 5145.77 | 1009 | 0.02 | 0.01 | 0.03 | 0 | 0 | 592.78 | 1452 | 0.00 | 0.00 | 0.00 | 0 | 0 | 0 | ||||||||||||||
| 699 | 0.09 | 0.00 | 0.01 | 0 | 0 | 379.85 | 448 | 0.02 | 0.00 | 0.00 | 0 | 0 | 239.89 | 476 | 0.12 | 0.07 | 0.26 | 0 | 0 | 0 | 686 | 0.01 | 0.00 | 0.00 | 0 | 0 | 433.54 | ||||||||||||||
| 1713 | 0.03 | 0.00 | 0.02 | 0 | 0 | 299.62 | 5 | 0.01 | 0.00 | 0.00 | 0 | 0 | 0 | ||||||||||||||||||||||||||||
| 447 | 0.02 | 0.00 | 0.01 | 0 | 0 | 80.11 | 351 | 0.01 | 0.00 | 0.01 | 0 | 0 | 11.04 | ||||||||||||||||||||||||||||
| 446 | 0.01 | 0.00 | 0.00 | 0 | 0 | 3399.33 | 906 | 0.18 | 0.00 | 0.02 | 0 | 0 | 168.97 | ||||||||||||||||||||||||||||
| 508 | 0.01 | 0.00 | 0.00 | 0 | 0 | 520.31 | |||||||||||||||||||||||||||||||||||
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | ||||||||||||||||||||||||||||||||||||
| Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | ||||||||||||||||||||||||||||||
quants.c: 298 - 0.14 %
| Run orig_default | Run gcc_default | Run aocc_default | Run icx_10 | Run gcc_4 | Run aocc_6 | ||||||||||||||||||||||||||||||||||||
| Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions | Loop Source Regions | ||||||||||||||||||||||||||||||||||||
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2925 | 0.28 | 0.01 | 0.03 | 60.7 | 29.66 | 1147.06 | 2122 | 0.36 | 0.01 | 0.03 | 60 | 28.75 | 1070.9 | 3339 | 0.29 | 0.01 | 0.03 | 58.33 | 28.75 | 1046.81 | 2663 | 0.35 | 0.01 | 0.03 | 60.7 | 29.66 | 1043.67 | 2154 | 0.31 | 0.01 | 0.03 | 59.65 | 29.28 | 1046.73 | |||||||
| Sum on 1 analyzed binary loop (libggml-cpu.so - 2925) | Sum on 1 analyzed binary loop (libggml-cpu.so - 2122) | Sum on 1 analyzed binary loop (libggml-cpu.so - 3339) | Sum on 1 analyzed binary loop (libggml-cpu.so - 2663) | Sum on 1 analyzed binary loop (libggml-cpu.so - 2154) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | ||||||||||||||||||||||||||||||||||||
| Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count | ||||||||||||||||||||||||||||||
| Loop Computation Issues | Loop Computation Issues | Loop Computation Issues | Loop Computation Issues | Loop Computation Issues | |||||||||||||||||||||||||||||||||||||
| Presence of expensive FP instructions | 1 | Presence of expensive FP instructions | 1 | Presence of expensive FP instructions | 1 | Presence of expensive FP instructions | 1 | Presence of expensive FP instructions | 1 | ||||||||||||||||||||||||||||||||
| Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 0 | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 0 | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 0 | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 0 | ||||||||||||||||||||||||||||||||
| Control Flow Issues | Control Flow Issues | Control Flow Issues | Control Flow Issues | Control Flow Issues | |||||||||||||||||||||||||||||||||||||
| Presence of 2 to 4 paths | 1 | Presence of 2 to 4 paths | 0 | Presence of 2 to 4 paths | 0 | Presence of 2 to 4 paths | 1 | Presence of 2 to 4 paths | 0 | ||||||||||||||||||||||||||||||||
| Presence of more than 4 paths | 0 | Presence of more than 4 paths | 1 | Presence of more than 4 paths | 1 | Presence of more than 4 paths | 0 | Presence of more than 4 paths | 1 | ||||||||||||||||||||||||||||||||
| Data Access Issues | Data Access Issues | Data Access Issues | Data Access Issues | Data Access Issues | |||||||||||||||||||||||||||||||||||||
| More than 10% of the vector loads instructions are unaligned | 1 | More than 10% of the vector loads instructions are unaligned | 1 | More than 10% of the vector loads instructions are unaligned | 1 | More than 10% of the vector loads instructions are unaligned | 1 | More than 10% of the vector loads instructions are unaligned | 1 | ||||||||||||||||||||||||||||||||
| Presence of special instructions executing on a single port | 1 | Presence of special instructions executing on a single port | 1 | Presence of special instructions executing on a single port | 1 | Presence of special instructions executing on a single port | 1 | Presence of special instructions executing on a single port | 1 | ||||||||||||||||||||||||||||||||
| Vectorization Roadblocks | Vectorization Roadblocks | Vectorization Roadblocks | Vectorization Roadblocks | Vectorization Roadblocks | |||||||||||||||||||||||||||||||||||||
| Presence of 2 to 4 paths | 1 | Presence of 2 to 4 paths | 0 | Presence of 2 to 4 paths | 0 | Presence of 2 to 4 paths | 1 | Presence of 2 to 4 paths | 0 | ||||||||||||||||||||||||||||||||
| Presence of more than 4 paths | 0 | Presence of more than 4 paths | 1 | Presence of more than 4 paths | 1 | Presence of more than 4 paths | 0 | Presence of more than 4 paths | 1 | ||||||||||||||||||||||||||||||||
| Inefficient Vectorization | Inefficient Vectorization | Inefficient Vectorization | Inefficient Vectorization | Inefficient Vectorization | |||||||||||||||||||||||||||||||||||||
| Presence of special instructions executing on a single port | 1 | Presence of special instructions executing on a single port | 1 | Presence of special instructions executing on a single port | 1 | Presence of special instructions executing on a single port | 1 | Presence of special instructions executing on a single port | 1 | ||||||||||||||||||||||||||||||||

