| Run orig_default | Run aocc_default | Run icx_4 | Run aocc_2 |

Loop Source Regions (identical for all four runs):
- /beegfs/hackathon/users/eoseret/qaas_runs_test/isix06.benchmarkcenter.megware.com/177-002-9891/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/amx/mmq.cpp: 303-330
- /beegfs/hackathon/users/eoseret/qaas_runs_test/isix06.benchmarkcenter.megware.com/177-002-9891/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/amx/mmq.cpp: 651-663
- /beegfs/hackathon/users/eoseret/qaas_runs_test/isix06.benchmarkcenter.megware.com/177-002-9891/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/amx/mmq.cpp: 816-818
- /beegfs/hackathon/users/eoseret/qaas_runs_test/isix06.benchmarkcenter.megware.com/177-002-9891/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/amx/mmq.cpp: 825-825
- /beegfs/hackathon/users/eoseret/qaas_runs_test/isix06.benchmarkcenter.megware.com/177-002-9891/llama.cpp/build/llama.cpp/ggml/src/ggml-cpu/amx/mmq.cpp: 1390-1392

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
| orig_default | 659 | 0.05 | 0.03 | 2.97 | 90.91 | 38.76 | 0 |
| aocc_default | 384 | 0.06 | 0.03 | 3.23 | 90.91 | 38.76 | 0 |
| icx_4 | 675 | 0.06 | 0.03 | 3.24 | 90.91 | 38.76 | 0 |
| aocc_2 | 384 | 0.05 | 0.03 | 3.19 | 90.91 | 38.76 | 0 |
| Sum on 1 analyzed binary loop (libggml-cpu.so - 659) | Sum on 1 analyzed binary loop (libggml-cpu.so - 384) | Sum on 1 analyzed binary loop (libggml-cpu.so - 675) | Sum on 1 analyzed binary loop (libggml-cpu.so - 384) |
| Analysis | Count (orig_default) | Count (aocc_default) | Count (icx_4) | Count (aocc_2) |
| Loop Computation Issues | | | | |
| Presence of a large number of scalar integer instructions | 1 | 1 | 1 | 1 |
| Data Access Issues | | | | |
| Presence of constant non-unit stride data access | 1 | 1 | 1 | 1 |
| Presence of special instructions executing on a single port | 1 | 1 | 1 | 1 |
| More than 20% of the loads are accessing the stack | 1 | 1 | 1 | 1 |
| Vectorization Roadblocks | | | | |
| Presence of constant non-unit stride data access | 1 | 1 | 1 | 1 |
| Inefficient Vectorization | | | | |
| Presence of special instructions executing on a single port | 1 | 1 | 1 | 1 |