| Run orig_default | Run icx_default | Run aocc_9 | Run icx_6 |
| Loop Source Regions | | Loop Source Regions | | Loop Source Regions | | Loop Source Regions | |
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| 2640 | 0.00 | 0.00 | 0.00 | 0 | 0 | 2294 | 0.00 | 0.00 | 0.00 | 0 | 0 | 2263 | 0.00 | 0.00 | 0.00 | 0 | 0 | 2853 | 0.00 | 0.00 | 0.00 | 0 | 0 |
| 2738 | 0.00 | 0.00 | 0.00 | 0 | 0 | 2392 | 0.00 | 0.00 | 0.00 | 0 | 0 | 2448 | 0.00 | 0.00 | 0.00 | 0 | 0 | 3821 | 0.00 | 0.00 | 0.00 | 0 | 0 |
| 2638 | 0.00 | 0.00 | 0.00 | 0 | 0 | 1023 | 0.00 | 0.00 | 0.00 | 0 | 0 | 2349 | 0.02 | 0.00 | 0.00 | 0 | 0 | 4116 | 0.01 | 0.00 | 0.00 | 0 | 0 |
| 2741 | 0.01 | 0.00 | 0.00 | 0 | 0 | 2395 | 0.01 | 0.00 | 0.00 | 0 | 0 | 2131 | 0.00 | 0.00 | 0.00 | 0 | 0 | 4128 | 0.00 | 0.00 | 0.00 | 0 | 0 |
| 317 | 0.05 | 0.00 | 0.00 | 0 | 0 | 2155 | 0.00 | 0.00 | 0.00 | 0 | 0 | 298 | 0.06 | 0.00 | 0.00 | 0 | 0 | 4118 | 0.00 | 0.00 | 0.00 | 0 | 0 |
| 822 | 0.00 | 0.00 | 0.00 | 0 | 0 | 2504 | 0.00 | 0.00 | 0.00 | 0 | 0 | 1139 | 0.02 | 0.00 | 0.01 | 0 | 0 | 654 | 0.11 | 0.00 | 0.00 | 0 | 0 |
| 766 | 0.04 | 0.01 | 0.05 | 0 | 0 | 2298 | 0.00 | 0.00 | 0.00 | 0 | 0 | 1142 | 0.04 | 0.01 | 0.06 | 0 | 0 | 638 | 0.00 | 0.00 | 0.00 | 0 | 0 |
| 99 | 0.00 | 0.00 | 0.00 | 0 | 0 | 372 | 0.05 | 0.00 | 0.00 | 0 | 0 | 695 | 0.04 | 0.01 | 0.05 | 0 | 0 | 3515 | 0.01 | 0.00 | 0.00 | 0 | 0 |
| 485 | 0.03 | 0.01 | 0.03 | 0 | 0 | 815 | 0.00 | 0.00 | 0.00 | 0 | 0 | 689 | 0.02 | 0.00 | 0.01 | 0 | 0 | 2855 | 0.03 | 0.01 | 0.03 | 0 | 0 |
| 90 | 0.00 | 0.00 | 0.00 | 0 | 0 | 1948 | 0.01 | 0.00 | 0.00 | 0 | 0 | 423 | 0.03 | 0.01 | 0.02 | 0 | 0 | 105 | 0.00 | 0.00 | 0.00 | 0 | 0 |
| 2528 | 0.01 | 0.00 | 0.00 | 0 | 0 | 444 | 0.03 | 0.01 | 0.02 | 0 | 0 | 353 | 0.02 | 0.01 | 0.03 | 0 | 0 | 114 | 0.00 | 0.00 | 0.00 | 0 | 0 |
| 399 | 0.03 | 0.01 | 0.03 | 0 | 0 | 866 | 0.02 | 0.00 | 0.02 | 0 | 0 | 1297 | 0.01 | 0.00 | 0.00 | 0 | 0 | 631 | 0.03 | 0.01 | 0.02 | 0 | 0 |
| 5 | 0.01 | 0.00 | 0.01 | 0 | 0 | 124 | 0.01 | 0.00 | 0.00 | 0 | 0 | 1301 | 0.01 | 0.00 | 0.01 | 0 | 0 | 2844 | 0.01 | 0.00 | 0.01 | 0 | 0 |
| 1463 | 0.04 | 0.01 | 0.04 | 0 | 0 | 873 | 0.04 | 0.01 | 0.05 | 0 | 0 | 1951 | 0.01 | 0.00 | 0.00 | 0 | 0 | 3296 | 0.00 | 0.00 | 0.00 | 0 | 0 |
| 1457 | 0.02 | 0.00 | 0.01 | 0 | 0 | 2099 | 0.01 | 0.00 | 0.01 | 0 | 0 | 4 | 0.01 | 0.00 | 0.01 | 0 | 0 | 3292 | 0.01 | 0.00 | 0.00 | 0 | 0 |
| 1724 | 0.03 | 0.01 | 0.03 | 0 | 0 | 6 | 0.01 | 0.00 | 0.01 | 0 | 0 | 1412 | 0.00 | 0.00 | 0.00 | 0 | 0 | 2689 | 0.01 | 0.00 | 0.00 | 0 | 0 |
| 1196 | 0.03 | 0.01 | 0.04 | 0 | 0 | 1418 | 0.04 | 0.01 | 0.04 | 0 | 0 | 1312 | 0.03 | 0.01 | 0.04 | 0 | 0 | 2442 | 0.02 | 0.00 | 0.01 | 0 | 0 |
| 1197 | 0.01 | 0.00 | 0.01 | 0 | 0 | 1707 | 0.01 | 0.00 | 0.01 | 0 | 0 | 1317 | 0.01 | 0.00 | 0.00 | 0 | 0 | 525 | 0.01 | 0.00 | 0.00 | 0 | 0 |
| 1705 | 0.00 | 0.00 | 0.00 | 0 | 0 | 401 | 0.01 | 0.00 | 0.00 | 0 | 0 | 59 | 0.00 | 0.00 | 0.00 | 0 | 0 | 1635 | 0.04 | 0.01 | 0.05 | 0 | 0 |
| 1375 | 0.00 | 0.00 | 0.00 | 0 | 0 | 558 | 0.02 | 0.00 | 0.02 | 0 | 0 | 319 | 0.01 | 0.00 | 0.00 | 0 | 0 | 3 | 0.02 | 0.00 | 0.01 | 0 | 0 |
| 1731 | 0.01 | 0.00 | 0.00 | 0 | 0 | 2918 | 0.00 | 0.00 | 0.00 | 0 | 0 | 993 | 0.01 | 0.00 | 0.01 | 0 | 0 | 3309 | 0.01 | 0.00 | 0.01 | 0 | 0 |
| 1710 | 0.01 | 0.00 | 0.01 | 0 | 0 | 113 | 0.00 | 0.00 | 0.00 | 0 | 0 | 994 | 0.04 | 0.01 | 0.04 | 0 | 0 | 132 | 0.01 | 0.00 | 0.00 | 0 | 0 |
| 762 | 0.02 | 0.00 | 0.01 | 0 | 0 | 2094 | 0.01 | 0.00 | 0.00 | 0 | 0 | 71 | 0.01 | 0.00 | 0.00 | 0 | 0 | 2444 | 0.03 | 0.01 | 0.03 | 0 | 0 |
| 2109 | 0.03 | 0.01 | 0.04 | 0 | 0 | 8 | 0.00 | 0.00 | 0.00 | 0 | 0 | 893 | 0.04 | 0.01 | 0.02 | 0 | 0 |
| 1734 | 0.03 | 0.01 | 0.03 | 0 | 0 | | |
| 1412 | 0.01 | 0.00 | 0.01 | 0 | 0 | | |
| 2111 | 0.02 | 0.00 | 0.00 | 0 | 0 | | |
| 1605 | 0.00 | 0.00 | 0.00 | 0 | 0 | | |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count |
| Run orig_default | Run icx_default | Run aocc_9 | Run icx_6 |
| Loop Source Regions | - /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/llamafile/sgemm.cpp: 138-138
- /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/llamafile/sgemm.cpp: 818-846
- /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/llamafile/sgemm.cpp: 1038-1038
- /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/llamafile/sgemm.cpp: 1044-1044
| Loop Source Regions | - /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/llamafile/sgemm.cpp: 138-138
- /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/llamafile/sgemm.cpp: 818-821
- /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/llamafile/sgemm.cpp: 827-846
- /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/llamafile/sgemm.cpp: 966-966
- /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/llamafile/sgemm.cpp: 1038-1038
- /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/llamafile/sgemm.cpp: 1044-1044
| Loop Source Regions | - /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/llamafile/sgemm.cpp: 138-138
- /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/llamafile/sgemm.cpp: 826-846
- /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/llamafile/sgemm.cpp: 966-966
- /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/llamafile/sgemm.cpp: 1038-1038
- /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/llamafile/sgemm.cpp: 1044-1044
| Loop Source Regions | |
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| 2365 | 0.08 | 0.05 | 0.22 | 0 | 0 | 2688 | 0.09 | 0.06 | 0.22 | 0 | 0 | 1814 | 0.07 | 0.04 | 0.15 | 0 | 0 | | | | | | |
| Sum on 1 analyzed binary loop (libggml-cpu.so - 2365) | Sum on 1 analyzed binary loop (libggml-cpu.so - 2688) | Sum on 1 analyzed binary loop (libggml-cpu.so - 1814) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count |
| Run orig_default | Run icx_default | Run aocc_9 | Run icx_6 |
| Loop Source Regions | | Loop Source Regions | | Loop Source Regions | | Loop Source Regions | - /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/simd-mappings.h: 130-130
- /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/arch/x86/quants.c: 1066-1073
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| | | | | | | | | | | | | | | | | | | 3704 | 0.15 | 0.13 | 0.52 | 88.89 | 20.14 |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (libggml-cpu.so - 3704) |
| Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count |
| | | | | | | Loop Computation Issues | |
| | | | | | | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 |
| | | | | | | Data Access Issues | |
| | | | | | | Presence of indirect access | 1 |
| | | | | | | Presence of special instructions executing on a single port | 1 |
| | | | | | | Vectorization Roadblocks | |
| | | | | | | Presence of indirect access | 1 |
| | | | | | | Inefficient Vectorization | |
| | | | | | | Presence of special instructions executing on a single port | 1 |
| Run orig_default | Run icx_default | Run aocc_9 | Run icx_6 |
| Loop Source Regions | | Loop Source Regions | | Loop Source Regions | | Loop Source Regions | - /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/simd-mappings.h: 130-130
- /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/./ggml-impl.h: 346-346
- /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/./ggml-impl.h: 389-389
- /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/./ggml-impl.h: 399-404
- /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/vec.h: 508-509
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| | | | | | | | | | | | | | | | | | | 3306 | 0.18 | 0.13 | 0.51 | 85.05 | 20.79 |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (libggml-cpu.so - 3306) |
| Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count |
| | | | | | | Loop Computation Issues | |
| | | | | | | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 |
| | | | | | | Data Access Issues | |
| | | | | | | Presence of indirect access | 1 |
| | | | | | | Presence of special instructions executing on a single port | 1 |
| | | | | | | Vectorization Roadblocks | |
| | | | | | | Presence of indirect access | 1 |
| | | | | | | Inefficient Vectorization | |
| | | | | | | Presence of special instructions executing on a single port | 1 |
| Run orig_default | Run icx_default | Run aocc_9 | Run icx_6 |
| Loop Source Regions | | Loop Source Regions | | Loop Source Regions | | Loop Source Regions | - /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/simd-mappings.h: 130-130
- /beegfs/hackathon/users/eoseret/qaas_runs_test/gmz17.benchmarkcenter.megware.com/176-631-9244/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/vec.cpp: 331-332
|
| ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| | | | | | | | | | | | | | | | | | | 1621 | 0.11 | 0.06 | 0.26 | 5.88 | 8.82 |
| No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (libggml-cpu.so - 1621) |
| Analysis | Count | Analysis | Count | Analysis | Count | Analysis | Count |
| | | | | | | Loop Computation Issues | |
| | | | | | | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 |
| | | | | | | Data Access Issues | |
| | | | | | | Presence of indirect access | 1 |
| | | | | | | Vectorization Roadblocks | |
| | | | | | | Presence of indirect access | 1 |