
Loops Index

75 loops have been discarded from the report because their coverage is lower than the threshold set by object_coverage_threshold (0.01%). Together they represent about 1.47% of the application. To include them, change the value of object_coverage_threshold in the experiment directory configuration file, then rerun the command with the additional parameter --force-static-analysis.
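
The filtering rule described above can be illustrated with a short sketch. This is not the profiler's own implementation: the Loop type, the split_by_coverage function and the two loops with ids 9001 and 9002 are hypothetical and exist only to show how a coverage threshold separates reported loops from discarded ones. Only the coverage values for loops 322 and 2663 are taken from this report.

```python
from typing import List, NamedTuple, Tuple


class Loop(NamedTuple):
    loop_id: int
    coverage_pct: float  # exclusive coverage, in percent of the application


def split_by_coverage(
    loops: List[Loop], threshold_pct: float
) -> Tuple[List[Loop], List[Loop], float]:
    """Split loops into (kept, discarded, total coverage of discarded loops).

    A loop is discarded when its coverage is strictly lower than the threshold.
    """
    kept = [lp for lp in loops if lp.coverage_pct >= threshold_pct]
    discarded = [lp for lp in loops if lp.coverage_pct < threshold_pct]
    discarded_total = sum(lp.coverage_pct for lp in discarded)
    return kept, discarded, discarded_total


loops = [
    Loop(322, 84.90),   # coverage taken from the report
    Loop(2663, 0.03),   # coverage taken from the report
    Loop(9001, 0.004),  # hypothetical loop below the threshold
    Loop(9002, 0.002),  # hypothetical loop below the threshold
]

# 0.01 mirrors the object_coverage_threshold value quoted in the report.
kept, discarded, discarded_total = split_by_coverage(loops, threshold_pct=0.01)
print("kept:", [lp.loop_id for lp in kept])
print("discarded:", [lp.loop_id for lp in discarded])
print("coverage lost to discarded loops (%):", round(discarded_total, 4))
```

Lowering threshold_pct in this sketch moves loops from the discarded list to the kept list, which is the effect the report describes for a lower object_coverage_threshold.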

Loop id: 322
Source Location: libggml-cpu.so - mmq.cpp:1573-1597 [...]
Source Function: ggml_backend_amx_mul_mat(ggml_compute_params const*, ggml_tensor*)::{lambda(int, int)#2}::operator()(int, int) const::{lambda()#1}::operator()() const
Level: Single
Exclusive Coverage icx_10 (%): 84.90
Inclusive Coverage icx_10 (%): 84.90
Max Exclusive Time Over Threads icx_10 (s): 24.63
Max Inclusive Time Over Threads icx_10 (s): 24.63
Exclusive Time w.r.t. Wall Time icx_10 (s): 24.01
Inclusive Time w.r.t. Wall Time icx_10 (s): 24.01
Nb Threads icx_10 / GFLOPS icx_10: 19289.34
Vectorization Ratio (%): NA
Vector Length Use (%): NA
Speedup If No Scalar Integer: NA
Speedup If FP Vectorized: NA
Speedup If Fully Vectorized: NA
Speedup If Perfect Load Balancing icx_10: 1.05
Stride 0: NA
Stride 1: NA
Stride n: NA
Stride Unknown: NA
Stride Indirect: NA
Array Access Efficiency: 0.00

Loop id: 2663
Source Location: libggml-cpu.so - quants.c:298-355 [...]
Source Function: quantize_row_q8_0
Level: Single
Exclusive Coverage icx_10 (%): 0.03
Inclusive Coverage icx_10 (%): 0.03
Max Exclusive Time Over Threads icx_10 (s): 0.35
Max Inclusive Time Over Threads icx_10 (s): 0.35
Exclusive Time w.r.t. Wall Time icx_10 (s): 0.01
Inclusive Time w.r.t. Wall Time icx_10 (s): 0.01
Nb Threads icx_10 through Array Access Efficiency: 61043.6760.729.6611.342.741.3902000100.00