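| Name | Module | Coverage (%), orig_default | Coverage (%), gcc_default | Coverage (%), armclang_4 | Coverage (%), gcc_4 | Inclusive Time w.r.t. Wall Time (s), orig_default | Inclusive Time w.r.t. Wall Time (s), gcc_default | Inclusive Time w.r.t. Wall Time (s), armclang_4 | Inclusive Time w.r.t. Wall Time (s), gcc_4 | Max Inc. Time over Threads (s), orig_default | Max Inc. Time over Threads (s), gcc_default | Max Inc. Time over Threads (s), armclang_4 | Max Inc. Time over Threads (s), gcc_4 | Nb Threads, orig_default | Nb Threads, gcc_default | Nb Threads, armclang_4 | Nb Threads, gcc_4 | Deviation (coverage), orig_default | Deviation (coverage), gcc_default | Deviation (coverage), armclang_4 | Deviation (coverage), gcc_4 | Deviation (time), orig_default | Deviation (time), gcc_default | Deviation (time), armclang_4 | Deviation (time), gcc_4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |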
| kai_run_matmul_clamp_f32_qsi8d32p4x8_qsi4c32p4x8_16x4_neon_i8mm | libggml-cpu.so | 53.68 | 55.18 | 55.24 | 56.16 | 3.82 | 3.83 | 3.75 | 3.86 | 3.37 | 3.41 | 3.33 | 3.36 | 64 | 64 | 64 | 64 | 2.23 | 1.94 | 2.26 | 1.71 | 0.09 | 0.10 | 0.09 | 0.08 |
| ggml_vec_dot_q6_K_q8_K | libggml-cpu.so | 17.53 | 16.98 | 16.37 | 15.76 | 1.25 | 1.18 | 1.11 | 1.08 | 1.12 | 1.04 | 0.98 | 0.95 | 64 | 64 | 64 | 64 | 1.13 | 0.66 | 0.62 | 0.54 | 0.05 | 0.03 | 0.02 | 0.02 |
| gomp_team_barrier_wait_end | libgomp.so.1.0.0 | NA | 19.94 | NA | 19.92 | NA | 1.38 | NA | 1.37 | NA | 1.44 | NA | 1.48 | NA | 64 | NA | 64 | NA | 2.29 | NA | 2.02 | NA | 0.13 | NA | 0.11 |
| kmp_flag_64<false, true>::wait(kmp_info*, int, void*) | libomp.so | 17.74 | NA | 17.53 | NA | 1.26 | NA | 1.19 | NA | 1.51 | NA | 1.38 | NA | 64 | NA | 64 | NA | 2.01 | NA | 1.94 | NA | 0.13 | NA | 0.12 | NA |
| ggml_compute_forward_flash_attn_ext | libggml-cpu.so | 1.14 | 1.18 | 1.39 | 1.38 | 0.08 | 0.08 | 0.09 | 0.09 | 0.11 | 0.12 | 0.13 | 0.11 | 64 | 64 | 64 | 64 | 0.35 | 0.36 | 0.33 | 0.29 | 0.02 | 0.02 | 0.02 | 0.02 |
| kmp_flag_native<unsigned long long, (flag_type)1, true>::notdone_check() | libomp.so | 1.93 | NA | 1.97 | NA | 0.14 | NA | 0.13 | NA | 0.18 | NA | 0.19 | NA | 64 | NA | 64 | NA | 0.49 | NA | 0.50 | NA | 0.03 | NA | 0.03 | NA |
| ggml_compute_forward_rope_f32(ggml_compute_params const*, ggml_tensor*, bool) | libggml-cpu.so | 0.83 | 0.95 | 0.84 | 0.94 | 0.06 | 0.07 | 0.06 | 0.06 | 0.08 | 0.09 | 0.09 | 0.10 | 64 | 64 | 64 | 64 | 0.24 | 0.26 | 0.24 | 0.28 | 0.01 | 0.01 | 0.01 | 0.02 |
| __pthread_mutex_lock | libc.so.6 | 0.69 | 0.55 | 0.59 | 0.56 | 0.05 | 0.04 | 0.04 | 0.04 | 0.09 | 0.08 | 0.09 | 0.07 | 61 | 63 | 56 | 63 | 0.41 | 0.29 | 0.37 | 0.25 | 0.02 | 0.02 | 0.02 | 0.01 |
| ggml_vec_dot_f16 | libggml-cpu.so | 0.48 | 0.49 | 0.57 | 0.59 | 0.03 | 0.03 | 0.04 | 0.04 | 0.06 | 0.06 | 0.07 | 0.06 | 64 | 64 | 64 | 64 | 0.18 | 0.18 | 0.24 | 0.23 | 0.01 | 0.01 | 0.01 | 0.01 |
| __aarch64_ldadd4_acq_rel | libgomp.so.1.0.0 | NA | 0.99 | NA | 0.92 | NA | 0.07 | NA | 0.06 | NA | 0.11 | NA | 0.09 | NA | 64 | NA | 64 | NA | 0.38 | NA | 0.33 | NA | 0.02 | NA | 0.02 |
| ggml_vec_swiglu_f32 | libggml-cpu.so | 0.54 | 0.32 | 0.36 | 0.38 | 0.04 | 0.02 | 0.02 | 0.03 | 0.20 | 0.12 | 0.12 | 0.12 | 16 | 16 | 16 | 16 | 0.53 | 0.33 | 0.32 | 0.34 | 0.03 | 0.02 | 0.02 | 0.02 |
| __sched_yield | libc.so.6 | 0.67 | NA | 0.65 | NA | 0.05 | NA | 0.04 | NA | 0.07 | NA | 0.08 | NA | 63 | NA | 64 | NA | 0.23 | NA | 0.25 | NA | 0.01 | NA | 0.01 | NA |
| kai_run_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0 | libggml-cpu.so | 0.18 | 0.21 | 0.19 | 0.22 | 0.01 | 0.01 | 0.01 | 0.01 | 0.68 | 0.76 | 0.70 | 0.79 | 1 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| kai_run_lhs_quant_pack_qsi8d32p4x8sb_f32_neon | libggml-cpu.so | 0.23 | 0.18 | 0.17 | 0.18 | 0.02 | 0.01 | 0.01 | 0.01 | 0.24 | 0.21 | 0.17 | 0.20 | 4 | 4 | 4 | 4 | 0.33 | 0.71 | 0.22 | 0.22 | 0.02 | 0.03 | 0.02 | 0.03 |
| unknown_function | [vdso] | 0.32 | NA | 0.32 | NA | 0.02 | NA | 0.02 | NA | 0.05 | NA | 0.06 | NA | 63 | NA | 59 | NA | 0.17 | NA | 0.19 | NA | 0.01 | NA | 0.01 | NA |
| sincosf | libm.so.6 | NA | 0.29 | NA | 0.30 | NA | 0.02 | NA | 0.02 | NA | 0.05 | NA | 0.05 | NA | 63 | NA | 63 | NA | 0.16 | NA | 0.15 | NA | 0.01 | NA | 0.01 |
| gomp_barrier_wait_end | libgomp.so.1.0.0 | NA | 0.27 | NA | 0.27 | NA | 0.02 | NA | 0.02 | NA | 0.04 | NA | 0.04 | NA | 63 | NA | 58 | NA | 0.15 | NA | 0.13 | NA | 0.01 | NA | 0.01 |
| __aarch64_ldadd8_acq_rel | libomp.so | 0.26 | NA | 0.26 | NA | 0.02 | NA | 0.02 | NA | 0.09 | NA | 0.06 | NA | 56 | NA | 53 | NA | 0.25 | NA | 0.23 | NA | 0.02 | NA | 0.01 | NA |
| __sincosf_finite | libamath.so | 0.23 | NA | 0.27 | NA | 0.02 | NA | 0.02 | NA | 0.04 | NA | 0.04 | NA | 61 | NA | 61 | NA | 0.14 | NA | 0.13 | NA | 0.01 | NA | 0.01 | NA |
| __expf_finite | libamath.so | 0.25 | NA | 0.24 | NA | 0.02 | NA | 0.02 | NA | 0.04 | NA | 0.04 | NA | 62 | NA | 61 | NA | 0.13 | NA | 0.13 | NA | 0.01 | NA | 0.01 | NA |
| ggml_compute_forward_add_non_quantized | libggml-cpu.so | 0.12 | 0.12 | 0.12 | 0.12 | 0.01 | 0.01 | 0.01 | 0.01 | 0.05 | 0.04 | 0.04 | 0.05 | 23 | 27 | 28 | 23 | 0.23 | 0.19 | 0.20 | 0.26 | 0.01 | 0.01 | 0.01 | 0.02 |
| ggml_compute_forward_mul_mat | libggml-cpu.so | 0.14 | 0.09 | 0.10 | 0.10 | 0.01 | 0.01 | 0.01 | 0.01 | 0.02 | 0.02 | 0.02 | 0.02 | 51 | 40 | 46 | 43 | 0.08 | 0.07 | 0.06 | 0.07 | 0.00 | 0.00 | 0.00 | 0.00 |
| __kmp_hyper_barrier_release(barrier_type, kmp_info*, int, int, int, void*) | libomp.so | 0.21 | NA | 0.21 | NA | 0.01 | NA | 0.01 | NA | 0.04 | NA | 0.04 | NA | 59 | NA | 59 | NA | 0.12 | NA | 0.13 | NA | 0.01 | NA | 0.01 | NA |
| __expf_finite | libm.so.6 | NA | 0.23 | NA | 0.19 | NA | 0.02 | NA | 0.01 | NA | 0.03 | NA | 0.03 | NA | 59 | NA | 61 | NA | 0.13 | NA | 0.10 | NA | 0.01 | NA | 0.01 |
| ggml_cpu_fp32_to_fp16 | libggml-cpu.so | 0.09 | 0.11 | 0.13 | 0.09 | 0.01 | 0.01 | 0.01 | 0.01 | 0.02 | 0.02 | 0.03 | 0.03 | 39 | 41 | 47 | 42 | 0.08 | 0.10 | 0.10 | 0.07 | 0.01 | 0.01 | 0.01 | 0.00 |
| ggml_graph_compute_thread | libggml-cpu.so | 0.22 | NA | 0.18 | NA | 0.02 | NA | 0.01 | NA | 0.04 | NA | 0.03 | NA | 55 | NA | 50 | NA | 0.14 | NA | 0.15 | NA | 0.01 | NA | 0.01 | NA |
| ggml_compute_forward_mul | libggml-cpu.so | 0.11 | 0.08 | 0.13 | 0.07 | 0.01 | 0.01 | 0.01 | 0.01 | 0.04 | 0.03 | 0.04 | 0.03 | 32 | 25 | 35 | 25 | 0.14 | 0.12 | 0.17 | 0.13 | 0.01 | 0.01 | 0.01 | 0.01 |
| ggml_compute_forward_rms_norm | libggml-cpu.so | 0.14 | 0.13 | 0.05 | 0.05 | 0.01 | 0.01 | 0.00 | 0.00 | 0.06 | 0.05 | 0.02 | 0.03 | 17 | 18 | 16 | 19 | 0.32 | 0.22 | 0.09 | 0.09 | 0.02 | 0.01 | 0.01 | 0.00 |
| ggml::cpu::kleidiai::extra_buffer_type::get_tensor_traits(ggml_tensor const*) | libggml-cpu.so | 0.12 | 0.07 | 0.07 | 0.05 | 0.01 | 0.00 | 0.00 | 0.00 | 0.03 | 0.02 | 0.02 | 0.01 | 47 | 33 | 33 | 32 | 0.09 | 0.06 | 0.08 | 0.04 | 0.01 | 0.00 | 0.00 | 0.00 |
| ggml_graph_compute_thread | libggml-cpu.so | NA | 0.13 | NA | 0.13 | NA | 0.01 | NA | 0.01 | NA | 0.03 | NA | 0.03 | NA | 50 | NA | 50 | NA | 0.10 | NA | 0.09 | NA | 0.01 | NA | 0.01 |
| unknown_function | libggml-cpu.so | 0.07 | 0.04 | 0.06 | 0.06 | 0.00 | 0.00 | 0.00 | 0.00 | 0.02 | 0.02 | 0.02 | 0.02 | 37 | 23 | 30 | 32 | 0.06 | 0.05 | 0.07 | 0.06 | 0.00 | 0.00 | 0.00 | 0.00 |
| __GI___lll_lock_wake | libc.so.6 | 0.05 | 0.06 | 0.05 | 0.05 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.02 | 0.02 | 0.02 | 26 | 35 | 26 | 27 | 0.05 | 0.06 | 0.05 | 0.06 | 0.00 | 0.00 | 0.00 | 0.00 |
| __GI___lll_lock_wait | libc.so.6 | 0.05 | 0.04 | 0.06 | 0.05 | 0.00 | 0.00 | 0.00 | 0.00 | 0.02 | 0.01 | 0.02 | 0.01 | 27 | 22 | 33 | 31 | 0.05 | 0.05 | 0.05 | 0.05 | 0.00 | 0.00 | 0.00 | 0.00 |
| __kmp_hyper_barrier_gather(barrier_type, kmp_info*, int, int, void (*)(void*, void*), void*) | libomp.so | 0.09 | NA | 0.09 | NA | 0.01 | NA | 0.01 | NA | 0.03 | NA | 0.03 | NA | 37 | NA | 39 | NA | 0.09 | NA | 0.09 | NA | 0.00 | NA | 0.01 | NA |
| __memcpy | libastring.so | 0.09 | NA | 0.08 | NA | 0.01 | NA | 0.01 | NA | 0.08 | NA | 0.08 | NA | 33 | NA | 32 | NA | 0.17 | NA | 0.18 | NA | 0.01 | NA | 0.01 | NA |
| kmp_flag_native<unsigned long long, (flag_type)1, true>::done_check() | libomp.so | 0.08 | NA | 0.08 | NA | 0.01 | NA | 0.01 | NA | 0.04 | NA | 0.05 | NA | 19 | NA | 17 | NA | 0.14 | NA | 0.19 | NA | 0.01 | NA | 0.01 | NA |
| __kmp_barrier | libomp.so | 0.07 | NA | 0.07 | NA | 0.00 | NA | 0.00 | NA | 0.03 | NA | 0.03 | NA | 32 | NA | 35 | NA | 0.08 | NA | 0.06 | NA | 0.01 | NA | 0.00 | NA |
| ggml_is_empty | libggml-base.so | 0.02 | 0.03 | 0.03 | 0.03 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.01 | 0.01 | 14 | 21 | 17 | 17 | 0.03 | 0.03 | 0.05 | 0.06 | 0.00 | 0.00 | 0.00 | 0.00 |
| @plt_start@ | libomp.so | 0.05 | NA | 0.05 | NA | 0.00 | NA | 0.00 | NA | 0.02 | NA | 0.01 | NA | 26 | NA | 31 | NA | 0.06 | NA | 0.04 | NA | 0.00 | NA | 0.00 | NA |
| ggml_cpu_extra_compute_forward | libggml-cpu.so | 0.01 | 0.03 | 0.01 | 0.03 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.01 | 0.01 | 9 | 20 | 7 | 20 | 0.00 | 0.05 | 0.07 | 0.02 | 0.00 | 0.00 | 0.00 | 0.00 |
| ggml_compute_forward_set_rows | libggml-cpu.so | 0.02 | 0.01 | 0.02 | 0.03 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.02 | 0.02 | 16 | 10 | 10 | 18 | 0.03 | 0.03 | 0.06 | 0.05 | 0.00 | 0.00 | 0.00 | 0.00 |
| __kmpc_barrier | libomp.so | 0.03 | NA | 0.02 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 0.02 | NA | 19 | NA | 14 | NA | 0.03 | NA | 0.07 | NA | 0.00 | NA | 0.00 | NA |
| ggml::cpu::kleidiai::tensor_traits::compute_forward_q4_0(ggml_compute_params*, ggml_tensor*) [clone .isra.0] | libggml-cpu.so | NA | 0.02 | NA | 0.02 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 0.01 | NA | 16 | NA | 17 | NA | 0.02 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
| ggml_backend_cpu_get_extra_buffer_types() | libggml-cpu.so | 0.01 | 0.01 | 0.01 | 0.02 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.01 | 0.01 | 3 | 6 | 6 | 11 | 0.05 | 0.00 | 0.00 | 0.03 | 0.00 | 0.00 | 0.00 | 0.00 |
| __GI___pthread_mutex_unlock_usercnt | libc.so.6 | 0.01 | 0.01 | 0.01 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.00 | 0.01 | 5 | 7 | 9 | 3 | 0.04 | 0.03 | 0.00 | 0.05 | 0.00 | 0.00 | 0.00 | 0.00 |
| ggml_compute_forward_add | libggml-cpu.so | 0.01 | 0.01 | 0.01 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.00 | 0.00 | 7 | 10 | 5 | 5 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| __memset | libastring.so | 0.02 | NA | 0.02 | NA | 0.00 | NA | 0.00 | NA | 0.05 | NA | 0.04 | NA | 4 | NA | 7 | NA | 0.26 | NA | 0.19 | NA | 0.02 | NA | 0.01 | NA |
| ggml::cpu::kleidiai::tensor_traits::compute_forward_q4_0(ggml_compute_params*, ggml_tensor*) | libggml-cpu.so | 0.02 | NA | 0.02 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 0.01 | NA | 14 | NA | 10 | NA | 0.00 | NA | 0.04 | NA | 0.00 | NA | 0.00 | NA |
| __fs_pow_1 | libamath.so | 0.02 | NA | 0.01 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 0.01 | NA | 17 | NA | 10 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| __log2_finite | libm.so.6 | NA | 0.02 | NA | 0.02 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 12 | NA | 12 | NA | 0.00 | NA | 0.04 | NA | 0.00 | NA | 0.00 |
| __kmp_yield | libomp.so | 0.01 | NA | 0.02 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 0.01 | NA | 6 | NA | 15 | NA | 0.05 | NA | 0.03 | NA | 0.00 | NA | 0.00 | NA |
| ggml_type_size | libggml-base.so | 0.01 | 0.01 | 0.01 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.00 | 5 | 6 | 6 | 6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| __kmp_now_nsec | libomp.so | 0.02 | NA | 0.01 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 0.01 | NA | 12 | NA | 9 | NA | 0.02 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| unknown_function | libggml-base.so | 0.01 | 0.00 | 0.00 | 0.02 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.00 | 0.01 | 6 | 1 | 2 | 11 | 0.00 | 0.00 | 0.01 | 0.03 | 0.00 | 0.00 | 0.00 | 0.00 |
| gomp_team_barrier_wait | libgomp.so.1.0.0 | NA | 0.01 | NA | 0.01 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 0.01 | NA | 11 | NA | 7 | NA | 0.00 | NA | 0.03 | NA | 0.00 | NA | 0.00 |
| ggml_compute_forward_glu | libggml-cpu.so | 0.01 | 0.01 | 0.00 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.00 | 0.01 | 4 | 5 | 3 | 6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| __ieee754_log2 | libamath.so | 0.01 | NA | 0.01 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 0.01 | NA | 8 | NA | 9 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| llama_vocab::impl::load(llama_model_loader&, LLM_KV const&) | libllama.so | 0.00 | 0.01 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.04 | 0.02 | 0.01 | 1 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| __exp2f_finite | libamath.so | 0.01 | NA | 0.01 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 0.01 | NA | 7 | NA | 8 | NA | 0.03 | NA | 0.01 | NA | 0.00 | NA | 0.00 | NA |
| ggml::cpu::repack::extra_buffer_type::get_tensor_traits(ggml_tensor const*) | libggml-cpu.so | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.01 | 0.00 | 10 | 3 | 1 | 1 | 0.03 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, unsigned char>, s... | libllama.so | 0.01 | 0.00 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.02 | 0.02 | 0.03 | 0.01 | 1 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| ggml_is_contiguous_1 | libggml-base.so | 0.01 | NA | 0.01 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 0.01 | NA | 5 | NA | 9 | NA | 0.00 | NA | 0.03 | NA | 0.00 | NA | 0.00 | NA |
| std::_Hashtable<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::pair<std::pai... | libllama.so | NA | 0.01 | NA | 0.01 | NA | 0.00 | NA | 0.00 | NA | 0.04 | NA | 0.04 | NA | 1 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
| GOMP_barrier | libgomp.so.1.0.0 | NA | 0.01 | NA | 0.01 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 0.01 | NA | 6 | NA | 6 | NA | 0.05 | NA | 0.01 | NA | 0.00 | NA | 0.00 |
| std::_Hashtable<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::pair<std::pai... | libllama.so | 0.01 | NA | 0.01 | NA | 0.00 | NA | 0.00 | NA | 0.04 | NA | 0.04 | NA | 1 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| __kmp_get_global_thread_id_reg | libomp.so | 0.01 | NA | 0.01 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 0.01 | NA | 6 | NA | 6 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| __powf_finite | libm.so.6 | NA | 0.01 | NA | 0.01 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 0.00 | NA | 4 | NA | 6 | NA | 0.04 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
| malloc_consolidate | libc.so.6 | 0.00 | 0.00 | 0.00 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.02 | 0.01 | 0.02 | 1 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| ggml_compute_forward_rope | libggml-cpu.so | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.00 | 0.00 | 7 | 2 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| llama_vocab::~llama_vocab() | libllama.so | NA | 0.01 | NA | 0.01 | NA | 0.00 | NA | 0.00 | NA | 0.03 | NA | 0.02 | NA | 1 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
| ggml_can_repeat | libggml-base.so | 0.00 | NA | 0.01 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.01 | NA | 0.01 | 0.01 | 3 | NA | 4 | 2 | 0.00 | NA | 0.04 | 0.00 | 0.00 | NA | 0.00 | 0.00 |
| ggml_cpu_get_sve_cnt | libggml-cpu.so | 0.01 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.01 | 0.01 | NA | NA | 6 | 3 | NA | NA | 0.04 | 0.01 | NA | NA | 0.00 | 0.00 | NA | NA |
| std::__detail::_Map_base<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, st... | libllama.so | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.01 | 0.01 | 1 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| __kmp_resume_if_soft_paused | libomp.so | 0.01 | NA | 0.01 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 4 | NA | 5 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| quantize_row_q8_K_ref | libggml-base.so | 0.01 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.01 | 0.01 | 5 | NA | 2 | 2 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 |
| ggml_kleidiai_select_kernels(cpu_feature, ggml_tensor const*) | libggml-cpu.so | 0.00 | NA | 0.01 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.01 | 0.01 | 1 | NA | 3 | 3 | 0.00 | NA | 0.05 | 0.00 | 0.00 | NA | 0.00 | 0.00 |
| __fs_log_1 | libamath.so | 0.00 | NA | 0.01 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 0.01 | NA | 2 | NA | 4 | NA | 0.00 | NA | 0.04 | NA | 0.00 | NA | 0.00 | NA |
| ggml_critical_section_start | libggml-base.so | 0.00 | 0.01 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.01 | 0.01 | NA | 2 | 4 | 1 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA |
| __libc_malloc | libc.so.6 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.01 | 0.01 | NA | 0.01 | 1 | 1 | NA | 1 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 |
| ggml_barrier | libggml-cpu.so | NA | 0.00 | 0.00 | 0.01 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.01 | NA | 1 | 1 | 4 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 |
| ggml_compute_forward_dup | libggml-cpu.so | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.01 | NA | 2 | 1 | 3 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 |
| ggml_backend_cpu_kleidiai_buffer_type | libggml-cpu.so | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 2 | 1 | 3 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 |
| std::_Hashtable<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::pair<std::pai... | libllama.so | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.02 | NA | 0.01 | NA | 1 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
| kai_get_lhs_packed_offset_lhs_quant_pack_qsi8d32p4x8sb_f32_neon | libggml-cpu.so | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.00 | 0.00 | 2 | 2 | 1 | 1 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| std::_Hashtable<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::pair<std::pai... | libllama.so | 0.01 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.02 | NA | 0.01 | NA | 1 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| _int_free | libc.so.6 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.02 | 0.01 | NA | 1 | 1 | 1 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 |
| syscall | libc.so.6 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 2 | NA | 3 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
| kai_get_rhs_packed_offset_matmul_clamp_f32_qsi8d32p4x8_qsi4c32p4x8_16x4_neon_i8mm | libggml-cpu.so | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 1 | 1 | NA | 3 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 |
| $x | libc.so.6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 1 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::find(char const*, unsigned long, unsigned long) const | libstdc++.so.6.0.33 | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.02 | NA | 1 | 1 | 1 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA |
| ggml_rope_yarn_corr_dims | libggml-base.so | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 2 | 1 | NA | 2 | 0.01 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 |
| __memcmpeq | libc.so.6 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 1 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
| ggml_compute_forward_get_rows | libggml-cpu.so | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.01 | NA | NA | 2 | 2 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA |
| dequantize_row_q4_0 | libggml-base.so | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 1 | NA | 1 | 2 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 |
| std::ostream::put(char) | libstdc++.so.6.0.33 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.01 | 1 | 1 | NA | 1 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 |
| std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long) | libstdc++.so.6.0.33 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 1 | 1 | 1 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| __libc_free | libc.so.6 | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.01 | 0.01 | NA | NA | 1 | 1 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA |
| __kmp_task_team_sync | libomp.so | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 0.00 | NA | 3 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| std::_Hashtable<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::pair<std::pai... | libllama.so | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 0.00 | NA | 1 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| kai_get_mr_matmul_clamp_f32_qsi8d32p4x8_qsi4c32p4x8_16x4_neon_i8mm | libggml-cpu.so | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 3 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 |
| unknown_function | libllama.so | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.01 | 1 | NA | NA | 1 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 |
| _int_malloc | libc.so.6 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.01 | 1 | NA | NA | 1 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 |
| unlink_chunk.isra.0 | libc.so.6 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.01 | 1 | NA | 1 | 1 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 |
| std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, std::alloca... | libllama.so | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 1 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| std::ostream::sentry::sentry(std::ostream&) | libstdc++.so.6.0.33 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.01 | 1 | 1 | NA | 1 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 |
| llama_kv_cache::set_input_kq_mask(ggml_tensor*, llama_ubatch const*, bool) const | libllama.so | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA | 1 | 1 | 1 | NA | 0.00 | 0.00 | 0.00 | NA | 0.00 | 0.00 | 0.00 | NA |
| __vfscanf_internal | libc.so.6 | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.01 | NA | NA | 1 | 1 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA |
| unicode_cpts_from_utf8(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) | libllama.so | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.01 | NA | NA | 0.01 | 1 | NA | NA | 1 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 |
| ggml_graph_compute.omp_outlined | libggml-cpu.so | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 2 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| gguf_kv_to_str[abi:cxx11](gguf_context const*, int) | libllama.so | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 0.00 | NA | 1 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| std::_Function_handler<unsigned long (unsigned long, unsigned long, unsigned long), unsigned long (*)(unsigned long, unsigned long, unsigned long)>::_M_invoke(std::_Any_data const&, unsigned long&&, unsigned long&&, unsigned long&... | libggml-cpu.so | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 2 | 1 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA |
| bool std::operator==<char, std::char_traits<char>, std::allocator<char> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const*) | binary | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 |
| __strlen | libastring.so | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
| std::basic_streambuf<char, std::char_traits<char> >::xsputn(char const*, long) | libstdc++.so.6.0.33 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 1 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
| std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, std::alloca... | libllama.so | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 1 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
| ggml_is_contiguous_0 | libggml-base.so | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
| std::_Function_handler<unsigned long (unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long), unsigned long (*)(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long)>::_M_invoke(s... | libggml-cpu.so | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 1 | 1 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA |
| llama_vocab::impl::token_to_piece_for_cache[abi:cxx11](int, bool) const | libllama.so | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 1 | 1 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA |
| _IO_sgetn | libc.so.6 | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA |
| std::_Function_handler<void (unsigned long, unsigned long, unsigned long, unsigned long, void const*, void const*, float*, unsigned long, unsigned long, float, float), void (*)(unsigned long, unsigned long, unsigned long, unsigned long, void const*, voi... | libggml-cpu.so | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 2 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA |
| ggml_get_glu_op | libggml-base.so | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 1 | NA | NA | 1 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 |
| ggml_nrows | libggml-base.so | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 1 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| __memchr | libastring.so | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| ggml_is_contiguous | libggml-base.so | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
| llama_kv_cache::prepare(std::vector<llama_ubatch, std::allocator<llama_ubatch> > const&) | libllama.so | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA | 1 | 1 | NA | NA | 0.00 | 0.00 | NA | NA | 0.00 | 0.00 | NA | NA |
| std::pair<std::__detail::_Node_iterator<std::pair<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<c... | libllama.so | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA |
| __kmp_task_team_setup | libomp.so | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA |
| llama_vocab::impl::token_get_attr(int) const | libllama.so | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 |
| kai_get_dst_offset_matmul_clamp_f32_qsi8d32p4x8_qsi4c32p4x8_16x4_neon_i8mm | libggml-cpu.so | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 |
| _IO_fread | libc.so.6 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 |
| void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) [clone .isra.0] | libggml-base.so | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 |
| llama_kv_cache::find_slot(llama_ubatch const&, bool) const | libllama.so | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 |
| __printf_buffer | libc.so.6 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 |
| kai_get_n_step_matmul_clamp_f32_qsi8d32p4x8_qsi4c32p4x8_16x4_neon_i8mm | libggml-cpu.so | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 |
| ggml_graph_compute._omp_fn.0 | libggml-cpu.so | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 |
| gomp_barrier_wait | libgomp.so.1.0.0 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 |
| replace_all(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<... | libllama.so | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
| _IO_file_xsgetn | libc.so.6 | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
| __vsnprintf | libc.so.6 | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
| std::__detail::_Prime_rehash_policy::_M_need_rehash(unsigned long, unsigned long, unsigned long) const | libstdc++.so.6.0.33 | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
| llama_model_loader::load_all_data(ggml_context*, std::unordered_map<unsigned int, ggml_backend_buffer*, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, ggml_backend_buffer*> > &g... | libllama.so | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
| __kmp_finish_implicit_task | libomp.so | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
| __kmp_invoke_microtask | libomp.so | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
| __kmp_init_implicit_task | libomp.so | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
| std::_Hash_bytes(void const*, unsigned long, unsigned long) | libstdc++.so.6.0.33 | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
| __bcmp | libastring.so | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
| __kmp_enter_single | libomp.so | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
| $x | ld-linux-aarch64.so.1 | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
| llama_context::decode(llama_batch const&) | libllama.so | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
| std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_replace_aux(unsigned long, unsigned long, unsigned long, char) | libstdc++.so.6.0.33 | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
| llama_vocab::impl::token_to_piece(int, char*, int, int, bool) const | libllama.so | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA |
| unicode_utf8_to_byte(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) | libllama.so | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA |
| gguf_get_arr_str | libggml-base.so | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA |
| std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::resize(unsigned long, char) | libstdc++.so.6.0.33 | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA |
| __logf_finite | libm.so.6 | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA |
| ggml_backend_sched_split_graph | libggml-base.so | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA |
| std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::push_back(char) | libstdc++.so.6.0.33 | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA |
| gomp_team_start | libgomp.so.1.0.0 | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA |
| operator new(unsigned long) | libstdc++.so.6.0.33 | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA |
| ggml_blck_size | libggml-base.so | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA |
| _dl_lookup_symbol_x | ld-linux-aarch64.so.1 | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA |
| gomp_thread_start | libgomp.so.1.0.0 | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA |
| std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::find(char, unsigned long) const | libstdc++.so.6.0.33 | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA |
| ggml_are_same_shape | libggml-base.so | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA |
| __kmp_affinity_create_cpuinfo_map(int*, kmp_i18n_id*) | libomp.so | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA |
| _ZNSt4pairINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES5_EC2IRS5_S8_TnNSt9enable_ifIXaaclsr5_PCCPE22_MoveConstructiblePairIT_T0_EEclsr5_PCCPE30_ImplicitlyMoveConvertiblePairISA_SB_EEEbE4typeELb1EEEOSA_OSB_ | libllama.so | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA |
| common_init()::$_0::__invoke(ggml_log_level, char const*, void*) | binary | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA |
| bool gguf_read_emplace_helper<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(gguf_reader const&, std::vector<gguf_kv, std::allocator<gguf_kv> >&, std::__cxx11::basic_string<c... | libggml-base.so | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA |
| @plt_start@ | libstdc++.so.6.0.33 | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA |
| ggml::cpu::kleidiai::tensor_traits::compute_forward(ggml_compute_params*, ggml_tensor*) | libggml-cpu.so | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA |
| _dl_relocate_object | ld-linux-aarch64.so.1 | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA |
| __kmp_invoke_task_func | libomp.so | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA |
| __kmpc_global_thread_num | libomp.so | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA |
| __kmp_fork_call | libomp.so | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA |
| __pthread_mutex_unlock | libc.so.6 | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA |