| | | | | | | requested parallelism | walltime sum (s) | nb instances | any sync average per thread time (s) | any wait average per thread time (s) | parallelism overhead (%) | local speedup if perfectly balanced | global speedup if perfectly balanced |
| start addr | function name | source location | level | ancestor thread num | invoker | parallel or teams | 1x6 | 1x72 | 1x96 | 1x120 | 1x128 | 1x144 | 1x168 | 1x192 | 1x6 | 1x72 | 1x96 | 1x120 | 1x128 | 1x144 | 1x168 | 1x192 | 1x6 | 1x72 | 1x96 | 1x120 | 1x128 | 1x144 | 1x168 | 1x192 | 1x6 | 1x72 | 1x96 | 1x120 | 1x128 | 1x144 | 1x168 | 1x192 | 1x6 | 1x72 | 1x96 | 1x120 | 1x128 | 1x144 | 1x168 | 1x192 | 1x6 | 1x72 | 1x96 | 1x120 | 1x128 | 1x144 | 1x168 | 1x192 | 1x6 | 1x72 | 1x96 | 1x120 | 1x128 | 1x144 | 1x168 | 1x192 | 1x6 | 1x72 | 1x96 | 1x120 | 1x128 | 1x144 | 1x168 | 1x192 |
| libggml-cpu.so:0x30185 | ggml_graph_compute | ggml-cpu.c:3148 | 0 | 0 | runtime | parallel | 6 | 72 | 96 | 120 | 128 | 144 | 168 | 192 | 57.320 | 13.478 | 13.295 | 12.872 | 12.574 | 12.984 | 16.974 | 17.894 | 513 | 513 | 513 | 513 | 513 | 513 | 513 | 513 | 12.321 | 8.190 | 8.592 | 8.458 | 8.328 | 8.965 | 12.295 | 13.066 | 12.221 | 8.127 | 8.530 | 8.393 | 8.261 | 8.900 | 12.227 | 12.994 | 21.5 | 60.8 | 64.6 | 65.7 | 66.2 | 69.0 | 72.4 | 73.0 | 1.274 | 2.549 | 2.827 | 2.916 | 2.961 | 3.231 | 3.628 | 3.706 | 1.268 | 2.329 | 2.536 | 2.589 | 2.613 | 2.812 | 3.178 | 3.266 |
| libggml-cpu.so:0x57d9f | ggml_backend_amx_convert_weight(ggml_tensor*, void const*, u... | mmq.cpp:2337 | 0 | 0 | runtime | parallel | 6 | 72 | 96 | 120 | 128 | 144 | 168 | 192 | 0.375 | 0.132 | 0.138 | 0.143 | 0.143 | 0.147 | 0.153 | 0.170 | 225 | 225 | 225 | 225 | 225 | 225 | 225 | 225 | 42.5 E-3 | 41.6 E-3 | 46.4 E-3 | 56.4 E-3 | 40.8 E-3 | 51.3 E-3 | 60.8 E-3 | 67.8 E-3 | 42.4 E-3 | 41.6 E-3 | 46.4 E-3 | 56.4 E-3 | 40.7 E-3 | 51.3 E-3 | 60.8 E-3 | 67.8 E-3 | 11.3 | 31.6 | 33.7 | 39.5 | 28.5 | 34.9 | 39.8 | 39.9 | 1.128 | 1.462 | 1.508 | 1.653 | 1.399 | 1.536 | 1.662 | 1.663 | 1.001 | 1.003 | 1.003 | 1.004 | 1.003 | 1.004 | 1.003 | 1.004 |