Loops Index

| Loop id | Source Location | Source Function | Level | Max Thread Time / Walltime gcc_3 (%) | Exclusive Coverage gcc_3 (%) | Inclusive Coverage gcc_3 (%) | Max Exclusive Time Over Threads gcc_3 (s) | Max Inclusive Time Over Threads gcc_3 (s) | Exclusive Time w.r.t. Wall Time gcc_3 (s) | Inclusive Time w.r.t. Wall Time gcc_3 (s) | Nb Threads gcc_3 | Vectorization Ratio (%) | Vector Length Use (%) | Speedup If No Scalar Integer | Speedup If FP Vectorized | Speedup If Fully Vectorized | Speedup If Perfect Load Balancing gcc_3 | Stride 0 | Stride 1 | Stride n | Stride Unknown | Stride Indirect | Array Access Efficiency |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 351 | libggml-cpu.so - quants.c:322-329 [...] | ggml_vec_dot_q8_0_q8_0 | Single | 62.61 | 66.11 | 66.11 | 8.04 | 8.04 | 7.92 | 7.92 | 96 | 53.33 | 61.67 | 1.05 | 2.09 | 1.48 | 1.04 | 0 | 0 | 4 | 2 | 0 | 66.67 |
| 74 | libggml-cpu.so - ggml-cpu.c:2879-2898 [...] | ggml_graph_compute_thread | Innermost | 0.47 | 0.20 | 0.20 | 0.06 | 0.06 | 0.02 | 0.02 | 93 | 0 | 48.33 | 1 | 1 | 1.28 | 2.51 | NA | NA | NA | NA | NA | 0.00 |
| 1613 | libggml-cpu.so - ops.cpp:8778-8939 [...] | ggml_compute_forward_flash_attn_ext | InBetween | 0.74 | 0.17 | 0.24 | 0.10 | 0.12 | 0.02 | 0.03 | 32 | 42.45 | 57.6 | 4.33 | 1.03 | 1.2 | 1.63 | NA | NA | NA | NA | NA | 0.00 |
| 788 | libggml-cpu.so - vec.cpp:311-311 [...] | ggml_vec_dot_f16 | Single | 0.93 | 0.16 | 0.16 | 0.12 | 0.12 | 0.02 | 0.02 | 32 | 100 | 100 | 1 | 1 | 1 | 2.18 | 0 | 0 | 4 | 6 | 0 | 60.00 |
| 75 | libggml-cpu.so - ggml-cpu.c:1664-2898 [...] | ggml_graph_compute_thread | Outermost | 0.27 | 0.14 | 0.34 | 0.04 | 0.09 | 0.02 | 0.04 | 94 | 0 | 37.5 | 1 | 1 | 4 | 2.04 | NA | NA | NA | NA | NA | 0.00 |
| 573 | libggml-base.so - ggml-quants.c:203-219 [...] | quantize_row_q8_0_ref | Innermost | 0.27 | 0.12 | 0.12 | 0.04 | 0.04 | 0.01 | 0.01 | 88 | 33.33 | 50.15 | 2.19 | 1.02 | 1.38 | 2.3 | NA | NA | NA | NA | NA | 0.00 |
| 64 | libggml-cpu.so - ggml-cpu.c:1162-1198 [...] | ggml_compute_forward_mul_mat | InBetween | 0.27 | 0.11 | 0.11 | 0.04 | 0.04 | 0.01 | 0.01 | 87 | 0 | 55.94 | 1 | 1 | 1.5 | 2.46 | NA | NA | NA | NA | NA | 0.00 |
| 820 | libggml-cpu.so - ops.cpp:6210-6245 [...] | ggml_compute_forward_rope_f32(ggml_compute_params const*, ggml_tensor*, bool) | Innermost | 0.27 | 0.07 | 0.07 | 0.04 | 0.04 | 0.01 | 0.01 | 76 | 2.22 | 26.94 | 1.66 | 1.48 | 3.88 | 3.22 | NA | NA | NA | NA | NA | 0.00 |
| 65 | libggml-cpu.so - ggml-cpu.c:1125-1397 [...] | ggml_compute_forward_mul_mat | Outermost | 0.19 | 0.05 | 0.25 | 0.03 | 0.06 | 0.01 | 0.03 | 64 | 0 | 100 | 1 | 1 | 1 | 2.88 | NA | NA | NA | NA | NA | 0.00 |
| 1622 | libggml-cpu.so - vec.h:491-491 [...] | ggml_compute_forward_flash_attn_ext | Innermost | 0.31 | 0.05 | 0.05 | 0.04 | 0.04 | 0.01 | 0.01 | 32 | 100 | 100 | 1 | 1 | 1 | 2.35 | 1 | 0 | 4 | 4 | 0 | 66.67 |
| 60 | libggml-cpu.so - ggml-cpu.c:1193-1194 | ggml_compute_forward_mul_mat | Innermost | 0.16 | 0.04 | 0.04 | 0.02 | 0.02 | 0.01 | 0.01 | 59 | 0 | 45 | 1 | 1 | 2.94 | 2.51 | 1 | 0 | 0 | 0 | 0 | 100.00 |
| 66 | libggml-cpu.so - ggml-cpu.c:1125-1395 [...] | ggml_compute_forward_mul_mat | Innermost | 0.16 | 0.04 | 0.04 | 0.02 | 0.02 | 0.00 | 0.00 | 56 | 0 | 58.15 | 1 | 1 | 1.75 | 2.46 | NA | NA | NA | NA | NA | 0.00 |
| 794 | libggml-cpu.so - vec.cpp:390-390 [...] | ggml_vec_swiglu_f32 | Innermost | 2.34 | 0.03 | 0.03 | 0.30 | 0.30 | 0.00 | 0.00 | 1 | 93.94 | 98.48 | 1.07 | 1 | 1 | 1 | 0 | 3 | 0 | 0 | 0 | 100.00 |
| 785 | libggml-cpu.so - vec.cpp:311-337 [...] | ggml_vec_dot_f16 | Outermost | 0.23 | 0.02 | 0.02 | 0.03 | 0.03 | 0.00 | 0.00 | 20 | 31.54 | 49.52 | 2.1 | 1.14 | 1.64 | 2.5 | NA | NA | NA | NA | NA | 0.00 |
| 72 | libggml-cpu.so - ggml-cpu.c:1290-1297 | ggml_compute_forward_mul_mat | InBetween | 0.08 | 0.01 | 0.01 | 0.01 | 0.01 | 0.00 | 0.00 | 25 | 0 | 53.62 | 1 | 1 | 1.98 | 1.92 | 1 | 0 | 0 | 1 | 0 | 75.00 |
| 1612 | libggml-cpu.so - ops.cpp:8818-8826 [...] | ggml_compute_forward_flash_attn_ext | Innermost | 0.08 | 0.01 | 0.01 | 0.01 | 0.01 | 0.00 | 0.00 | 18 | 0 | 42.86 | 10.67 | 1 | 4 | 1.8 | NA | NA | NA | NA | NA | 0.00 |
| 816 | libggml-cpu.so - ops.cpp:6210-6407 [...] | ggml_compute_forward_rope_f32(ggml_compute_params const*, ggml_tensor*, bool) | InBetween | 0.08 | 0.01 | 0.08 | 0.01 | 0.03 | 0.00 | 0.01 | 17 | 0.68 | 36.6 | 2.58 | 1.3 | 1.8 | 1.79 | NA | NA | NA | NA | NA | 0.00 |
| 815 | libggml-cpu.so - ops.cpp:6210-6475 [...] | ggml_compute_forward_rope_f32(ggml_compute_params const*, ggml_tensor*, bool) | InBetween | 0.08 | 0.01 | 0.09 | 0.01 | 0.03 | 0.00 | 0.01 | 17 | 0 | 32.41 | 2.04 | 1.51 | 1.66 | 1.89 | NA | NA | NA | NA | NA | 0.00 |
| 830 | libggml-cpu.so - ops.cpp:6446-6457 [...] | ggml_compute_forward_rope_f32(ggml_compute_params const*, ggml_tensor*, bool) | Innermost | 0.12 | 0.01 | 0.01 | 0.01 | 0.01 | 0.00 | 0.00 | 13 | 42.86 | 57.14 | 1 | 1 | 2 | 2.17 | 1 | 0 | 0 | 15 | 0 | 53.13 |
| 1615 | libggml-cpu.so - ops.cpp:8778-8939 [...] | ggml_compute_forward_flash_attn_ext | InBetween | 0.12 | 0.01 | 0.01 | 0.01 | 0.01 | 0.00 | 0.00 | 12 | 0 | 43.7 | 2.48 | 1.24 | 1.82 | 2.25 | NA | NA | NA | NA | NA | 0.00 |
| 515 | libggml-cpu.so - binary-ops.cpp:18-32 [...] | ggml_compute_forward_mul | Innermost | 0.55 | 0.01 | 0.01 | 0.07 | 0.07 | 0.00 | 0.00 | 2 | 100 | 100 | 1 | 1 | 1 | 1.87 | 0 | 0 | 3 | 0 | 0 | 75.00 |
| 821 | libggml-cpu.so - ops.cpp:6210-6490 [...] | ggml_compute_forward_rope_f32(ggml_compute_params const*, ggml_tensor*, bool) | Outermost | 0.08 | 0.01 | 0.10 | 0.01 | 0.03 | 0.00 | 0.01 | 11 | 19.23 | 43.27 | 1.56 | 1 | 1.5 | 1.69 | NA | NA | NA | NA | NA | 0.00 |
| 4449 | exec - sampling.cpp:125-126 [...] | common_sampler_sample(common_sampler*, llama_context*, int, bool) | Innermost | 0.51 | 0.01 | 0.01 | 0.06 | 0.06 | 0.00 | 0.00 | 1 | 16.13 | 32.26 | 1.5 | 1 | 4 | 1 | 0 | 0 | 2 | 9 | 0 | 54.55 |
| 1 | libggml-cpu.so - ggml-cpu.c:3228-3229 [...] | ggml_cpu_fp32_to_fp16 | Single | 0.08 | 0.01 | 0.01 | 0.01 | 0.01 | 0.00 | 0.00 | 12 | 100 | 100 | 1 | 1 | 1 | 1.85 | 2 | 0 | 0 | 0 | 0 | 100.00 |
| 572 | libggml-base.so - ggml-quants.c:203-219 [...] | quantize_row_q8_0_ref | InBetween | 0.04 | 0.00 | 0.12 | 0.01 | 0.03 | 0.00 | 0.01 | 11 | 0 | 25 | 1 | 1 | 4 | 1 | NA | NA | NA | NA | NA | 0.00 |
| 431 | libggml-cpu.so - binary-ops.cpp:10-32 [...] | ggml_compute_forward_add_non_quantized | Innermost | 0.39 | 0.00 | 0.00 | 0.05 | 0.05 | 0.00 | 0.00 | 1 | 100 | 100 | 1 | 1 | 1 | 1 | 0 | 0 | 3 | 0 | 0 | 75.00 |
| 1617 | libggml-cpu.so - vec.h:677-677 [...] | ggml_compute_forward_flash_attn_ext | Innermost | 0.08 | 0.00 | 0.00 | 0.01 | 0.01 | 0.00 | 0.00 | 5 | 100 | 100 | 1 | 1 | 1 | 1.25 | 0 | 0 | 1 | 3 | 0 | 56.25 |
| 1193 | libggml-cpu.so - ops.cpp:4325-4326 | ggml_compute_forward_rms_norm | Innermost | 0.23 | 0.00 | 0.00 | 0.03 | 0.03 | 0.00 | 0.00 | 1 | 100 | 91.11 | 1 | 1 | 1.11 | 1 | 0 | 0 | 0 | 2 | 0 | 50.00 |
| 103 | libggml-cpu.so - ggml-cpu.cpp:61-64 | ggml_backend_cpu_get_extra_buffer_types() | Single | 0.08 | 0.00 | 0.00 | 0.01 | 0.01 | 0.00 | 0.00 | 5 | 0 | 100 | 1 | 1 | 1 | 1.67 | NA | NA | NA | NA | NA | 0.00 |
| 62 | libggml-cpu.so - ggml-cpu.c:1397-1397 | ggml_compute_forward_mul_mat | Innermost | 0.04 | 0.00 | 0.00 | 0.01 | 0.01 | 0.00 | 0.00 | 6 | NA | NA | NA | NA | NA | 1 | NA | NA | NA | NA | NA | 0.00 |
| 67 | libggml-cpu.so - ggml-cpu.c:1132-1165 [...] | ggml_compute_forward_mul_mat | Innermost | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 6 | 0 | 45.31 | 1 | 1 | 2 | 1 | NA | NA | NA | NA | NA | 0.00 |
| 3550 | libllama.so - stl_algo.h:1594-1595 [...] | llama_token_data_array_partial_sort_inplace(llama_token_data_array*, int) | Innermost | 0.19 | 0.00 | 0.00 | 0.02 | 0.02 | 0.00 | 0.00 | 1 | 0 | 30.83 | 2.17 | 1 | 3.06 | 1 | 1.5 | 0.5 | 0.5 | 0 | 0 | 93.75 |
| 1614 | libggml-cpu.so - ops.cpp:8778-8939 [...] | ggml_compute_forward_flash_attn_ext | InBetween | 0.08 | 0.00 | 0.01 | 0.01 | 0.01 | 0.00 | 0.00 | 3 | 45.86 | 64.57 | 4.46 | 1.02 | 1.08 | 1.5 | NA | NA | NA | NA | NA | 0.00 |
| 1611 | libggml-cpu.so - ops.cpp:8778-8939 [...] | ggml_compute_forward_flash_attn_ext | Outermost | 0.04 | 0.00 | 0.24 | 0.00 | 0.09 | 0.00 | 0.03 | 3 | 0 | 30.77 | 3.22 | 1.34 | 4.25 | 1 | NA | NA | NA | NA | NA | 0.00 |
| 1197 | libggml-cpu.so - vec.h:677-677 [...] | ggml_compute_forward_rms_norm | Innermost | 0.08 | 0.00 | 0.00 | 0.01 | 0.01 | 0.00 | 0.00 | 1 | 100 | 100 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 3 | 0 | 56.25 |
| 4616 | libllama.so - basic_string.h:1077-3760 [...] | std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, unsigned char>, s... | Innermost | 0.08 | 0.00 | 0.00 | 0.01 | 0.01 | 0.00 | 0.00 | 1 | 0 | 50 | 1 | 1 | 2 | 1 | 0.67 | 0 | 0 | 2 | 0 | 61.11 |
| 3847 | libllama.so - basic_string.h:1077-3760 [...] | std::_Hashtable<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::pair<std::pai... | Single | 0.08 | 0.00 | 0.00 | 0.01 | 0.01 | 0.00 | 0.00 | 1 | 0 | 50 | 1 | 1 | 2 | 1 | NA | NA | NA | NA | NA | 0.00 |
| 238 | libggml-cpu.so - repack.cpp:1979-1982 | ggml_backend_cpu_repack_buffer_type() | Single | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 2 | 0 | 75 | 1 | 1 | 1 | 1 | NA | NA | NA | NA | NA | 0.00 |
| 35 | libggml-base.so - ggml.c:1363-1388 [...] | ggml_is_contiguous_0 | Single | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 2 | 0 | 48.53 | 1 | 1 | 2 | 1 | NA | NA | NA | NA | NA | 0.00 |
| 3912 | libllama.so - hashtable_policy.h:387-2058 [...] | llama_vocab::~llama_vocab() | Single | 0.08 | 0.00 | 0.00 | 0.01 | 0.01 | 0.00 | 0.00 | 1 | 0 | 50 | 1 | 1 | 2 | 1 | 0 | 0 | 0 | 2 | 0 | 50.00 |
| 4210 | libllama.so - basic_string.h:1077-3760 [...] | std::__detail::_Map_base<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, st... | Single | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1 | 0 | 50 | 1 | 1 | 2 | 1 | 0.67 | 0 | 0 | 2 | 0 | 61.11 |
| 1693 | libllama.so - basic_string.h:198-2708 [...] | replace_all(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<... | Outermost | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1 | 0 | 60.21 | 1 | 1 | 1.36 | 1 | NA | NA | NA | NA | NA | 0.00 |
| 1281 | libggml-cpu.so - ops.cpp:5551-5563 | ggml_compute_forward_set_rows | InBetween | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1 | 0 | 59.64 | 1 | 1 | 1.58 | 1 | NA | NA | NA | NA | NA | 66.67 |
| 1903 | libllama.so - llama-kv-cache.cpp:1239-1261 [...] | llama_kv_cache::set_input_kq_mask(ggml_tensor*, llama_ubatch const*, bool) const | Innermost | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1 | 0 | 34.77 | 4.25 | 1 | 5.23 | 1 | NA | NA | NA | NA | NA | 0.00 |
| 330 | exec - stl_uninitialized.h:119-119 [...] | std::vector<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__no... | Outermost | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1 | 0 | 56.5 | 1 | 1 | 1.5 | 1 | NA | NA | NA | NA | NA | 0.00 |
| 817 | libggml-cpu.so - ops.cpp:6210-6439 [...] | ggml_compute_forward_rope_f32(ggml_compute_params const*, ggml_tensor*, bool) | InBetween | 0.04 | 0.00 | 0.08 | 0.00 | 0.02 | 0.00 | 0.01 | 1 | 0 | 53.19 | 1 | 1 | 1.66 | 1 | NA | NA | NA | NA | NA | 0.00 |
| 1624 | libggml-cpu.so - vec.h:740-740 [...] | ggml_compute_forward_flash_attn_ext | Innermost | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1 | 100 | 100 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 3 | 0 | 56.25 |
| 334 | exec - stl_algobase.h:401-405 [...] | std::vector<std::__cxx11::sub_match<__gnu_cxx::__normal_iterator<char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::allocator<std::__cxx11::sub_match<__gnu_cxx::__no... | Single | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1 | 0 | 36 | 1 | 1 | 2.82 | 1 | 0 | 0 | 3 | 2 | 0 | 65.00 |
| 1905 | libllama.so - stl_algobase.h:951-952 | llama_kv_cache::set_input_kq_mask(ggml_tensor*, llama_ubatch const*, bool) const | Single | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1 | 100 | 100 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 62.50 |
| 4053 | libllama.so - char_traits.h:381-381 [...] | llama_vocab::impl::load(llama_model_loader&, LLM_KV const&) | Single | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1 | 11.11 | 36.57 | 3 | 1 | 5.45 | 1 | NA | NA | NA | NA | NA | 0.00 |
| 4419 | libllama.so - unicode.cpp:799-811 [...] | unicode_cpts_from_utf8(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) | Innermost | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1 | 0 | 61.11 | 1 | 1 | 1 | 1 | NA | NA | NA | NA | NA | 0.00 |
| 4238 | libllama.so - hashtable.h:2627-2644 [...] | std::_Hashtable<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::pair<std::pai... | Innermost | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1 | 0 | 50 | 1 | 1 | 2 | 1 | 0 | 0 | 0 | 4 | 1 | 40.00 |
| 1171 | libggml-base.so - stl_uninitialized.h:642-642 [...] | bool gguf_read_emplace_helper<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(gguf_reader const&, std::vector<gguf_kv, std::allocator<gguf_kv> >&, std::__cxx11::basic_string<c... | Single | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1 | 0 | 50 | 1 | 1 | 2 | 1 | 0 | 0 | 1 | 1 | 0 | 62.50 |
| 4071 | libllama.so - llama-vocab.cpp:1575-2004 [...] | llama_vocab::impl::load(llama_model_loader&, LLM_KV const&) | Outermost | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1 | 0 | 48.68 | 1 | 1 | 1.33 | 1 | NA | NA | NA | NA | NA | 0.00 |
| 1195 | libggml-cpu.so - ops.cpp:4319-4365 [...] | ggml_compute_forward_rms_norm | Outermost | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1 | NA | NA | NA | NA | NA | 1 | NA | NA | NA | NA | NA | 0.00 |
| 44 | exec - main.cpp:663-681 [...] | main | Innermost | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1 | 8.51 | 41.49 | 7.5 | 1 | 1.32 | 1 | NA | NA | NA | NA | NA | 0.00 |