Loops
vec.cpp: 311 - 0.36 %

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|
| orig_default | 1119 | 0.08 | 0.01 | 0.05 | 0 | 0 | 1214.8 |
| aocc_default | 1232 | 0.15 | 0.02 | 0.10 | 0 | 0 | 1168.45 |
| gcc_default | 959 | 0.13 | 0.02 | 0.06 | 100 | 66.67 | 1790.37 |
| icx_2 | 1177 | 0.08 | 0.01 | 0.04 | 0 | 0 | 1258.71 |
| aocc_4 | 1232 | 0.17 | 0.02 | 0.09 | 0 | 0 | 1325 |
| gcc_6 | 951 | 0.06 | 0.01 | 0.02 | 100 | 66.67 | 1504.1 |


| Run | Optimizer analysis |
|---|---|
| orig_default | Sum on 1 analyzed binary loop (libggml-cpu.so - 1119) |
| aocc_default | Sum on 1 analyzed binary loop (libggml-cpu.so - 1232) |
| gcc_default | Sum on 1 analyzed binary loop (libggml-cpu.so - 959) |
| icx_2 | Sum on 1 analyzed binary loop (libggml-cpu.so - 1177) |
| aocc_4 | Sum on 1 analyzed binary loop (libggml-cpu.so - 1232) |
| gcc_6 | Sum on 1 analyzed binary loop (libggml-cpu.so - 951) |

| Category | Analysis (count per run) | orig_default | aocc_default | gcc_default | icx_2 | aocc_4 | gcc_6 |
|---|---|---|---|---|---|---|---|
| Control Flow Issues | Presence of 2 to 4 paths | 1 | | | 1 | | |
| Data Access Issues | Presence of constant non-unit stride data access | | | 1 | | | |
| Vectorization Roadblocks | Presence of 2 to 4 paths | 1 | | 0 | 1 | | |
| Vectorization Roadblocks | Presence of constant non-unit stride data access | 0 | | 1 | 0 | | |

quants.c: 298 - 0.05 %

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|
| orig_default | 2924 | 0.33 | 0.00 | 0.01 | 60.7 | 29.66 | 801.66 |
| aocc_default | 3307 | 0.30 | 0.00 | 0.01 | 58.33 | 28.75 | 877.69 |
| gcc_default | 2214 | 0.67 | 0.00 | 0.01 | 60 | 28.75 | 413.37 |
| icx_2 | 3143 | 0.26 | 0.00 | 0.01 | 60.7 | 29.66 | 1022.75 |
| aocc_4 | 3150 | 0.30 | 0.00 | 0.01 | 60.7 | 29.66 | 888.56 |
| gcc_6 | 2188 | 0.39 | 0.00 | 0.01 | 59.65 | 29.28 | 675.34 |


| Run | Optimizer analysis |
|---|---|
| orig_default | Sum on 1 analyzed binary loop (libggml-cpu.so - 2924) |
| aocc_default | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| gcc_default | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| icx_2 | Sum on 1 analyzed binary loop (libggml-cpu.so - 3143) |
| aocc_4 | Sum on 1 analyzed binary loop (libggml-cpu.so - 3150) |
| gcc_6 | Sum on 1 analyzed binary loop (libggml-cpu.so - 2188) |

| Category | Analysis (count per run) | orig_default | icx_2 | aocc_4 | gcc_6 |
|---|---|---|---|---|---|
| Loop Computation Issues | Presence of expensive FP instructions | 1 | 1 | 1 | 1 |
| Control Flow Issues | Presence of 2 to 4 paths | 1 | 1 | 1 | 0 |
| Control Flow Issues | Presence of more than 4 paths | 0 | 0 | 0 | 1 |
| Data Access Issues | More than 10% of the vector loads instructions are unaligned | 1 | 1 | 1 | 1 |
| Data Access Issues | Presence of special instructions executing on a single port | 1 | 1 | 1 | 1 |
| Vectorization Roadblocks | Presence of 2 to 4 paths | 1 | 1 | 1 | 0 |
| Vectorization Roadblocks | Presence of more than 4 paths | 0 | 0 | 0 | 1 |
| Inefficient Vectorization | Presence of special instructions executing on a single port | 1 | 1 | 1 | 1 |

binary-ops.cpp: 18 - 0.03 %

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|
| orig_default | 812 | 0.16 | 0.00 | 0.00 | 0 | 6.25 | 183.12 |
| aocc_default | 914 | 0.20 | 0.00 | 0.01 | 0 | 6.25 | 134.05 |
| gcc_default | 716 | 0.28 | 0.00 | 0.01 | 100 | 50 | 90.99 |
| icx_2 | 848 | 0.16 | 0.00 | 0.00 | 0 | 6.25 | 170.71 |
| aocc_4 | 914 | 0.22 | 0.00 | 0.01 | 0 | 6.25 | 110.66 |
| gcc_6 | 705 | 0.08 | 0.00 | 0.00 | 0 | 6.25 | 282.74 |


All six runs: No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.

ops.cpp: 4325 - 0.02 %

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|
| orig_default | 1577 | 0.10 | 0.00 | 0.00 | 0 | 7.81 | 462.98 |
| aocc_default | 1758 | 0.15 | 0.00 | 0.00 | 0 | 7.81 | 358.82 |
| gcc_default | 1322 | 0.13 | 0.00 | 0.00 | 0 | 7.81 | 362.96 |
| icx_2 | 1725 | 0.16 | 0.00 | 0.00 | 0 | 7.81 | 343.24 |
| aocc_4 | 1734 | 0.02 | 0.00 | 0.00 | 75 | 18.75 | 3628.12 |
| gcc_6 | 1320 | 0.14 | 0.00 | 0.00 | 0 | 7.81 | 355.96 |


| Run | Optimizer analysis |
|---|---|
| orig_default, aocc_default, gcc_default, icx_2, aocc_4 | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| gcc_6 | Sum on 1 analyzed binary loop (libggml-cpu.so - 1320) |

| Category | Analysis (count per run) | gcc_6 |
|---|---|---|
| Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 |

binary-ops.cpp: 10 - 0.02 %

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|
| orig_default | 698 | 0.06 | 0.00 | 0.00 | 0 | 6.25 | 452.22 |
| aocc_default | 816 | 0.09 | 0.00 | 0.00 | 0 | 6.25 | 274.3 |
| gcc_default | 634 | 0.20 | 0.00 | 0.00 | 100 | 50 | 124.16 |
| icx_2 | 712 | 0.09 | 0.00 | 0.00 | 0 | 6.25 | 261.67 |
| aocc_4 | 826 | 0.08 | 0.00 | 0.00 | 0 | 6.25 | 343.39 |
| gcc_6 | 633 | 0.08 | 0.00 | 0.00 | 0 | 6.25 | 275.71 |


All six runs: No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.

sampling.cpp: 125 - 0.01 %

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|
| orig_default | 1936 | 0.07 | 0.00 | 0.00 | 0 | 6.25 | 0 |
| aocc_default | 2992 | 0.07 | 0.00 | 0.00 | 0 | 6.25 | 0 |
| gcc_default | 2943 | 0.07 | 0.00 | 0.00 | 0 | 6.25 | 0 |
| icx_2 | 1963 | 0.11 | 0.00 | 0.00 | 33.33 | 8.33 | 0 |
| aocc_4 | 3069 | 0.09 | 0.00 | 0.00 | 0 | 6.25 | 0 |
| gcc_6 | 2881 | 0.11 | 0.00 | 0.00 | 0 | 6.25 | 0 |


All six runs: No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.

