| Loop id | Source Location | Source Function | Level | Exclusive Coverage gcc_4 (%) | Inclusive Coverage gcc_4 (%) | Max Exclusive Time Over Threads gcc_4 (s) | Max Inclusive Time Over Threads gcc_4 (s) | Exclusive Time w.r.t. Wall Time gcc_4 (s) | Inclusive Time w.r.t. Wall Time gcc_4 (s) | Nb Threads gcc_4 | GFLOPS gcc_4 | Vectorization Ratio (%) | Vector Length Use (%) | Speedup If No Scalar Integer | Speedup If FP Vectorized | Speedup If Fully Vectorized | Speedup If Perfect Load Balancing gcc_4 | Stride 0 | Stride 1 | Stride n | Stride Unknown | Stride Indirect | Array Access Efficiency |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 400 | libggml-cpu.so - mmq.cpp:1597-1597 [...] | (anonymous namespace)::tinygemm_kernel_vnni<block_q8_0, block_q8_0, float, 1, 64, 32>::apply(int, void const*, void const*, float*, int) | Single | 86.79 | 86.79 | 24.73 | 24.73 | 24.14 | 24.14 | 192 | 89.52 | 87.67 | 85.7 | 1 | 1 | 1.07 | 1.04 | 0 | 4 | 4 | 0 | 1 | 77.78 |
| 2154 | libggml-cpu.so - quants.c:298-321 [...] | quantize_row_q8_0 | Single | 0.03 | 0.03 | 0.31 | 0.31 | 0.01 | 0.01 | 6 | 1046.73 | 59.65 | 29.28 | 1.03 | 1.4 | 2.86 | 1.23 | NA | NA | NA | NA | NA | 0.00 |