| start addr | function name | source location | level | ancestor thread num | invoker | parallel or teams | requested parallelism (aocc_0) | walltime sum (s) (aocc_0) | nb instances (aocc_0) | any sync average per thread time (s) (aocc_0) | any wait average per thread time (s) (aocc_0) | parallelism overhead (%) (aocc_0) | local speedup if perfectly balanced (aocc_0) | global speedup if perfectly balanced (aocc_0) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| libggml-cpu.so:0x30c4c | ggml_graph_compute | ggml-cpu.c:3148 | 0 | 0 | runtime | parallel | 192 | 16.640 | 513 | 12.201 | 12.131 | 73.3 | 3.748 | 3.277 |
| libggml-cpu.so:0x56aff | ggml_backend_amx_convert_weight(ggml_tensor*, void const*, u... | mmq.cpp:2337 | 0 | 0 | runtime | parallel | 192 | 0.169 | 225 | 0.0655 | 0.0655 | 38.8 | 1.634 | 1.004 |
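
The derived columns appear to follow from the raw ones. The sketch below assumes, based only on the numbers in the two rows above and not on the profiler's documentation, that the parallelism overhead is the average per-thread sync time divided by the walltime sum, and that the local speedup is the walltime sum divided by what remains once that sync time is removed. The helper name `derived_metrics` is hypothetical. The global speedup is left out because it would need the total application walltime, which the table does not give.

```python
def derived_metrics(walltime_sum_s, sync_avg_s):
    """Return (overhead in %, local speedup) under the assumed formulas."""
    # Assumed: overhead is the share of the region's walltime spent in sync.
    overhead_pct = 100.0 * sync_avg_s / walltime_sum_s
    # Assumed: a perfectly balanced region would spend no time in sync.
    local_speedup = walltime_sum_s / (walltime_sum_s - sync_avg_s)
    return overhead_pct, local_speedup


# (walltime sum, any sync average per thread time), both in seconds,
# copied from the two table rows.
rows = {
    "ggml_graph_compute": (16.640, 12.201),
    "ggml_backend_amx_convert_weight": (0.169, 65.5e-3),
}

for name, (wall, sync) in rows.items():
    overhead, speedup = derived_metrics(wall, sync)
    print(f"{name}: overhead {overhead:.1f} %, local speedup {speedup:.3f}")
```

The recomputed values agree with the table to within the rounding of the displayed inputs, which is consistent with the assumed formulas but does not confirm them.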