| start addr | function name | source location | level | ancestor thread num | invoker | parallel or teams | icx_2: requested parallelism | icx_2: walltime sum (s) | icx_2: nb instances | icx_2: any sync average per thread time (s) | icx_2: any wait average per thread time (s) | icx_2: parallelism overhead (%) | icx_2: local speedup if perfectly balanced | icx_2: global speedup if perfectly balanced |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| libggml-cpu.so:0x1686c | ggml_graph_compute | ggml-cpu.c:682 | 0 | 0 | runtime | parallel | 192 | 18.477 | 513 | 13.589 | 13.529 | 73.5 | 3.780 | 3.351 |
| libggml-cpu.so:0x4d130 | ggml_backend_amx_convert_weight(ggml_tensor*, void const*, u... | mmq.cpp:2337 | 0 | 0 | runtime | parallel | 192 | 0.158 | 225 | 0.0675 | 0.0675 | 42.7 | 1.745 | 1.003 |