Functions and Loops

Name | Module | Max Thread Time / Walltime orig_0 (%) | Coverage orig_0 (%) | Coverage Excluding Loops orig_0 (%) | Max Inclusive Time Over Threads orig_0 (s) | Max Exclusive Time Over Threads orig_0 (s) | Inclusive Time w.r.t. Wall Time orig_0 (s) | Exclusive Time w.r.t. Wall Time orig_0 (s) | Nb Threads orig_0 | Deviation (coverage) orig_0 | Deviation (walltime) orig_0 | Categories orig_0 | Compilation Options
kai_run_matmul_clamp_f32_qsi8d32p4x8_qsi4c32p4x8_16x4_neon_i8mm+libggml-cpu.so30.5146.430.022.260.012.460.00954.880.21/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 -D GGML_BACKEND_BUILD -D GGML_BACKEND_SHARED -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHARED -D GGML_USE_CPU...
Loop 2353 - kai_matmul_clamp_f32_qsi8d32p4x8_qsi4c32p4x8_16x4_neon_i8mm.c:131-131 - libggml-cpu.so+0.000.000.000.000.000.000.0000.000.00
Loop 2352 - kai_matmul_clamp_f32_qsi8d32p4x8_qsi4c32p4x8_16x4_neon_i8mm.c:131-131 - libggml-cpu.so+0.000.000.000.000.000.000.0000.000.00
Loop 2351 - kai_matmul_clamp_f32_qsi8d32p4x8_qsi4c32p4x8_16x4_neon_i8mm.c:131-131 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 2356 - kai_matmul_clamp_f32_qsi8d32p4x8_qsi4c32p4x8_16x4_neon_i8mm.c:131-131 - libggml-cpu.so+0.0046.420.002.280.002.460.0000.000.00
Loop 2355 - kai_matmul_clamp_f32_qsi8d32p4x8_qsi4c32p4x8_16x4_neon_i8mm.c:131-131 - libggml-cpu.so+0.3446.420.172.280.032.460.01730.120.00
Loop 2354 - kai_matmul_clamp_f32_qsi8d32p4x8_qsi4c32p4x8_16x4_neon_i8mm.c:131-131 - libggml-cpu.so30.4446.2546.252.262.262.452.45954.860.21
kmp_flag_64<false, true>::wait(kmp_info*, int, void*)libomp.so38.2527.8627.862.842.841.481.48966.720.27OMP (%): 100.00
ggml_vec_dot_q6_K_q8_K+libggml-cpu.so8.2813.480.080.620.020.710.00960.560.01/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 -D GGML_BACKEND_BUILD -D GGML_BACKEND_SHARED -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHARED -D GGML_USE_CPU...
Loop 2234 - quants.c:2683-2812 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 2237 - quants.c:2492-2660 - libggml-cpu.so [...]+1.3513.401.300.670.100.710.07960.380.02
Loop 2236 - quants.c:2506-2590 - libggml-cpu.so [...]7.6812.1012.100.570.570.640.64960.620.02
Loop 2235 - quants.c:2683-2758 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
__GI___pthread_mutex_locklibc.so.63.372.682.680.250.250.140.14941.100.05Pthread (%): 100.00
kmp_flag_native<unsigned long long, (flag_type)1, true>::notdone_check()libomp.so2.421.831.830.180.180.100.10960.650.03OMP (%): 100.00
ggml_compute_forward_flash_attn_ext+libggml-cpu.so1.080.980.040.080.010.050.00880.350.02/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 --driver-mode=g++ -D GGML_BACKEND_BUILD -D GGML_BACKEND_SHARED -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHAR...
Loop 1713 - vec.h:282-725 - libggml-cpu.so [...]+0.340.940.160.150.030.050.01680.120.00
Loop 1718 - vec.h:411-458 - libggml-cpu.so0.540.320.320.040.040.020.02770.200.01
Loop 1720 - vec.h:710-717 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 1715 - vec.h:290-338 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 1714 - vec.h:343-348 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 1719 - vec.h:710-717 - libggml-cpu.so0.070.000.000.010.010.000.0030.000.00
Loop 1710 - vec.h:282-662 - libggml-cpu.so [...]+0.470.450.310.080.040.020.02830.170.01
Loop 1721 - ops.cpp:8778-8920 - libggml-cpu.so [...]+0.200.150.090.040.020.010.00510.070.00
Loop 1709 - vec.h:474-662 - libggml-cpu.so [...]+0.070.060.010.030.010.000.0090.000.00
Loop 1723 - vec.h:646-653 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 1725 - ops.cpp:8885-8886 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 1724 - ops.cpp:8885-8886 - libggml-cpu.so [...]0.070.000.000.010.010.000.0010.000.00
Loop 1722 - vec.h:646-653 - libggml-cpu.so0.200.050.050.020.020.000.00300.080.00
Loop 1711 - vec.h:343-348 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 1716 - vec.h:646-653 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 1712 - vec.h:290-338 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 1717 - vec.h:461-466 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
ggml_compute_forward_rope_f32(ggml_compute_params const*, ggml_tensor*, bool)+libggml-cpu.so1.080.850.000.080.010.040.00960.300.01/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 --driver-mode=g++ -D GGML_BACKEND_BUILD -D GGML_BACKEND_SHARED -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHAR...
Loop 1400 - ops.cpp:6210-6484 - libggml-cpu.so [...]+0.130.840.020.130.010.040.00160.030.00
Loop 1403 - ops.cpp:6446-6456 - libggml-cpu.so [...]0.400.060.060.030.030.000.00280.130.01
Loop 1405 - ops.cpp:6429-6442 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 1401 - ops.cpp:6462-6475 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 1404 - ops.cpp:6413-6426 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 1399 - ops.cpp:6210-6409 - libggml-cpu.so [...]+0.200.770.070.090.020.040.00410.080.00
Loop 1407 - ops.cpp:6220-6245 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 1409 - ops.cpp:6210-6245 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 1408 - ops.cpp:6220-6245 - libggml-cpu.so [...]1.080.700.700.080.080.040.04960.290.01
Loop 1406 - ops.cpp:6210-6303 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 1402 - ops.cpp:6479-6484 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
ggml_vec_swiglu_f32+libggml-cpu.so3.500.840.000.260.000.040.00160.780.03/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 --driver-mode=g++ -D GGML_BACKEND_BUILD -D GGML_BACKEND_SHARED -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHAR...
Loop 886 - vec.cpp:385-387 - libggml-cpu.so [...]3.500.840.840.260.260.040.04160.780.03
Loop 884 - vec.cpp:402-405 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 883 - vec.cpp:402-403 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 885 - vec.cpp:403-403 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
__GI___sched_yieldlibc.so.60.880.610.610.070.070.030.03940.300.01OMP (%): 100.00
unknown_function[vdso]0.810.430.000.060.000.020.00910.270.01OMP (%): 100.00
ggml_vec_dot_f16+libggml-cpu.so0.610.420.030.050.010.020.00830.220.01/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 --driver-mode=g++ -D GGML_BACKEND_BUILD -D GGML_BACKEND_SHARED -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHAR...
Loop 878 - vec.cpp:266-269 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 879 - vec.cpp:231-262 - libggml-cpu.so0.540.380.380.040.040.020.02820.210.01
ggml_graph_compute_thread+libggml-cpu.so0.810.350.010.060.010.020.00730.300.01/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 -D GGML_BACKEND_BUILD -D GGML_BACKEND_SHARED -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHARED -D GGML_USE_CPU...
Loop 90 - ggml-cpu.c:2087-2088 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 84 - ggml-cpu.c:1585-1587 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 73 - ggml-cpu.c:533-2897 - libggml-cpu.so [...]+0.810.340.340.060.060.020.02730.300.01
Loop 72 - ggml-cpu.c:533-2897 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 75 - ggml-cpu.c:1436-1642 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 79 - ggml-cpu.c:1436-1465 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 78 - ggml-cpu.c:1436-1465 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 77 - ggml-cpu.c:1438-1465 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 76 - ggml-cpu.c:1454-1462 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 83 - ggml-cpu.c:1436-1465 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 82 - ggml-cpu.c:1436-1465 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 81 - ggml-cpu.c:1438-1465 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 80 - ggml-cpu.c:1461-1462 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 74 - ggml-cpu.c:1592-1601 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 89 - ggml-cpu.c:1552-1560 - libggml-cpu.so+0.000.000.000.000.000.000.0000.000.00
Loop 88 - ggml-cpu.c:1552-1560 - libggml-cpu.so+0.000.000.000.000.000.000.0000.000.00
Loop 87 - ggml-cpu.c:1552-1560 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 86 - ggml-cpu.c:1572-1579 - libggml-cpu.so+0.000.000.000.000.000.000.0000.000.00
Loop 85 - ggml-cpu.c:1573-1579 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
__aarch64_ldadd8_acq_rellibomp.so1.140.330.330.090.090.020.02800.320.01OMP (%): 100.00
ggml::cpu::kleidiai::extra_buffer_type::get_tensor_traits(ggml_tensor const*)libggml-cpu.so0.540.280.280.040.040.010.01800.210.01/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 --driver-mode=g++ -D GGML_BACKEND_BUILD -D GGML_BACKEND_SHARED -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHAR...
__kmp_hyper_barrier_release(barrier_type, kmp_info*, int, int, int, void*)libomp.so0.940.270.270.070.070.010.01900.230.01OMP (%): 100.00
__sincosf_finitelibamath.so0.540.250.250.040.040.010.01840.170.01Math (%): 100.00
ggml_compute_forward_mul+libggml-cpu.so0.740.220.120.050.030.010.01710.220.01/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 --driver-mode=g++ -D GGML_BACKEND_BUILD -D GGML_BACKEND_SHARED -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHAR...
Loop 524 - binary-ops.cpp:18-95 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 523 - binary-ops.cpp:18-45 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 485 - binary-ops.cpp:18-95 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 486 - binary-ops.cpp:18-95 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 484 - binary-ops.cpp:18-45 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 497 - binary-ops.cpp:18-110 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 494 - binary-ops.cpp:18-110 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 495 - binary-ops.cpp:18-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 493 - binary-ops.cpp:84-84 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 496 - binary-ops.cpp:18-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 492 - binary-ops.cpp:18-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 515 - binary-ops.cpp:18-110 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 511 - binary-ops.cpp:18-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 514 - binary-ops.cpp:18-110 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 512 - binary-ops.cpp:84-84 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 513 - binary-ops.cpp:18-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 518 - binary-ops.cpp:18-110 - libggml-cpu.so [...]+0.000.090.000.050.000.010.0000.000.00
Loop 519 - binary-ops.cpp:18-110 - libggml-cpu.so [...]+0.000.090.000.050.000.010.0000.000.00
Loop 520 - binary-ops.cpp:18-32 - libggml-cpu.so [...]0.740.090.090.050.050.010.01160.290.01
Loop 522 - binary-ops.cpp:18-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 521 - binary-ops.cpp:18-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 509 - binary-ops.cpp:18-110 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 510 - binary-ops.cpp:18-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 508 - binary-ops.cpp:18-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 507 - binary-ops.cpp:84-101 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 482 - binary-ops.cpp:18-95 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 481 - binary-ops.cpp:18-45 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 483 - binary-ops.cpp:18-45 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 475 - binary-ops.cpp:18-95 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 474 - binary-ops.cpp:18-45 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 477 - binary-ops.cpp:18-95 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 476 - binary-ops.cpp:18-45 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 479 - binary-ops.cpp:18-95 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 480 - binary-ops.cpp:18-95 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 478 - binary-ops.cpp:18-45 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 526 - binary-ops.cpp:18-95 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 525 - binary-ops.cpp:18-45 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 502 - binary-ops.cpp:18-101 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 501 - binary-ops.cpp:18-101 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 500 - binary-ops.cpp:18-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 499 - binary-ops.cpp:84-84 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 498 - binary-ops.cpp:18-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 487 - binary-ops.cpp:18-110 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 488 - binary-ops.cpp:18-110 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 489 - binary-ops.cpp:18-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 491 - binary-ops.cpp:18-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 490 - binary-ops.cpp:18-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 503 - binary-ops.cpp:18-101 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 505 - binary-ops.cpp:18-101 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 504 - binary-ops.cpp:18-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 506 - binary-ops.cpp:18-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 517 - binary-ops.cpp:18-95 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 516 - binary-ops.cpp:18-45 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
kai_run_lhs_quant_pack_qsi8d32p4x8sb_f32_neon+libggml-cpu.so3.100.180.000.230.000.010.0040.740.04/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 -D GGML_BACKEND_BUILD -D GGML_BACKEND_SHARED -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHARED -D GGML_USE_CPU...
Loop 2318 - kai_lhs_quant_pack_qsi8d32p4x8sb_f32_neon.c:93-264 - libggml-cpu.so [...]+0.000.180.000.230.000.010.0000.000.00
Loop 2317 - kai_lhs_quant_pack_qsi8d32p4x8sb_f32_neon.c:96-258 - libggml-cpu.so [...]3.100.180.180.230.230.010.0140.710.04
Loop 2315 - kai_lhs_quant_pack_qsi8d32p4x8sb_f32_neon.c:268-335 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 2316 - kai_lhs_quant_pack_qsi8d32p4x8sb_f32_neon.c:271-332 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
__expf_finitelibamath.so0.400.170.170.030.030.010.01710.150.01Math (%): 100.00
__GI___lll_lock_waitlibc.so.60.340.140.140.020.020.010.01620.110.00System (%): 100.00
kai_run_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0+libggml-cpu.so7.140.130.000.530.000.010.0010.000.00/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 -D GGML_BACKEND_BUILD -D GGML_BACKEND_SHARED -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHARED -D GGML_USE_CPU...
Loop 2336 - kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c:107-154 - libggml-cpu.so [...]+0.810.130.010.530.060.010.0010.000.00
Loop 2338 - kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c:115-118 - libggml-cpu.so2.090.040.040.160.160.000.0010.000.00
Loop 2337 - kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c:127-139 - libggml-cpu.so [...]4.240.080.080.310.310.000.0010.000.00
Loop 2333 - kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c:107-154 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 2334 - kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c:127-134 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 2335 - kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c:115-118 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 2329 - kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c:107-154 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 2331 - kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c:107-154 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 2332 - kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c:115-118 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 2330 - kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c:127-142 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 2327 - kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c:107-154 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 2328 - kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c:107-154 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 2326 - kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c:123-154 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 2325 - kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c:127-148 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 2324 - kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c:145-148 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 2323 - kai_rhs_pack_nxk_qsi4c32pscalef16_qsu4c32s16s0.c:115-118 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
ggml_compute_forward_rms_norm+libggml-cpu.so0.670.120.010.050.010.010.00250.310.01/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 --driver-mode=g++ -D GGML_BACKEND_BUILD -D GGML_BACKEND_SHARED -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHAR...
Loop 1248 - ops.cpp:4319-4365 - libggml-cpu.so [...]+0.000.100.000.070.000.010.0000.000.00
Loop 1250 - vec.h:638-661 - libggml-cpu.so [...]+0.130.100.000.070.010.010.0030.050.00
Loop 1249 - ops.cpp:4325-4326 - libggml-cpu.so0.610.090.090.050.050.000.00160.250.01
Loop 1251 - ops.cpp:4325-4326 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 1252 - vec.h:646-653 - libggml-cpu.so0.130.010.010.010.010.000.0080.050.00
__kmp_barrierlibomp.so0.270.090.090.020.020.010.01590.070.00OMP (%): 100.00
ggml_compute_forward_mul_mat+libggml-cpu.so0.200.090.000.010.010.000.00540.070.00/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 -D GGML_BACKEND_BUILD -D GGML_BACKEND_SHARED -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHARED -D GGML_USE_CPU...
Loop 60 - ggml-cpu.c:1289-1297 - libggml-cpu.so+0.000.000.000.000.000.000.0000.000.00
Loop 59 - ggml-cpu.c:1289-1297 - libggml-cpu.so+0.000.000.000.000.000.000.0000.000.00
Loop 58 - ggml-cpu.c:1289-1297 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 55 - ggml-cpu.c:1125-1395 - libggml-cpu.so [...]+0.070.090.000.040.010.000.0010.000.00
Loop 56 - ggml-cpu.c:1125-1395 - libggml-cpu.so [...]+0.070.090.010.030.010.000.0090.010.00
Loop 54 - ggml-cpu.c:1125-1395 - libggml-cpu.so [...]0.130.020.020.010.010.000.00190.030.00
Loop 53 - ggml-cpu.c:1183-1194 - libggml-cpu.so [...]0.130.050.050.010.010.000.00350.050.00
Loop 57 - ggml-cpu.c:1197-1198 - libggml-cpu.so0.070.000.000.000.000.000.0020.000.00
__kmp_hyper_barrier_gather(barrier_type, kmp_info*, int, int, void (*)(void*, void*), void*)libomp.so0.470.090.090.040.040.000.00480.100.00OMP (%): 100.00
ggml_cpu_fp32_to_fp16+libggml-cpu.so0.340.090.000.020.000.000.00410.120.01/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 -D GGML_BACKEND_BUILD -D GGML_BACKEND_SHARED -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHARED -D GGML_USE_CPU...
Loop 0 - ggml-cpu.c:3228-3229 - libggml-cpu.so [...]0.340.090.090.020.020.000.00410.120.01
Loop 1 - ggml-cpu.c:3228-3229 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
ggml_compute_forward_add_non_quantized+libggml-cpu.so0.610.090.010.050.010.000.00220.230.01/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 --driver-mode=g++ -D GGML_BACKEND_BUILD -D GGML_BACKEND_SHARED -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHAR...
Loop 395 - binary-ops.cpp:10-101 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 394 - binary-ops.cpp:10-101 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 393 - binary-ops.cpp:10-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 392 - binary-ops.cpp:84-84 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 391 - binary-ops.cpp:10-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 410 - binary-ops.cpp:10-95 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 409 - binary-ops.cpp:10-45 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 417 - binary-ops.cpp:10-95 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 416 - binary-ops.cpp:10-45 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 402 - binary-ops.cpp:10-110 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 400 - binary-ops.cpp:84-101 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 403 - binary-ops.cpp:10-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 401 - binary-ops.cpp:10-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 380 - binary-ops.cpp:10-110 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 381 - binary-ops.cpp:10-110 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 383 - binary-ops.cpp:10-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 384 - binary-ops.cpp:10-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 382 - binary-ops.cpp:10-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 408 - binary-ops.cpp:10-110 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 407 - binary-ops.cpp:10-110 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 406 - binary-ops.cpp:10-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 405 - binary-ops.cpp:84-84 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 404 - binary-ops.cpp:10-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 411 - binary-ops.cpp:10-110 - libggml-cpu.so [...]+0.070.080.000.040.010.000.0010.000.00
Loop 412 - binary-ops.cpp:10-110 - libggml-cpu.so [...]+0.000.070.000.040.000.000.0000.000.00
Loop 415 - binary-ops.cpp:10-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 414 - binary-ops.cpp:10-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 413 - binary-ops.cpp:10-32 - libggml-cpu.so [...]0.540.070.070.040.040.000.00160.220.01
Loop 375 - binary-ops.cpp:10-95 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 374 - binary-ops.cpp:10-45 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 376 - binary-ops.cpp:10-45 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 390 - binary-ops.cpp:10-110 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 387 - binary-ops.cpp:10-110 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 386 - binary-ops.cpp:84-84 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 388 - binary-ops.cpp:10-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 389 - binary-ops.cpp:10-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 385 - binary-ops.cpp:10-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 419 - binary-ops.cpp:10-95 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 418 - binary-ops.cpp:10-45 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 378 - binary-ops.cpp:10-95 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 379 - binary-ops.cpp:10-95 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 377 - binary-ops.cpp:10-45 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 368 - binary-ops.cpp:10-95 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 367 - binary-ops.cpp:10-45 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 370 - binary-ops.cpp:10-95 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 369 - binary-ops.cpp:10-45 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 372 - binary-ops.cpp:10-95 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 373 - binary-ops.cpp:10-95 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 371 - binary-ops.cpp:10-45 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 396 - binary-ops.cpp:10-101 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 398 - binary-ops.cpp:10-101 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 397 - binary-ops.cpp:10-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 399 - binary-ops.cpp:10-32 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
__memcpylibastring.so0.540.080.080.040.040.000.00410.120.01String (%): 100.00
ggml_compute_forward_set_rows+libggml-cpu.so0.200.060.050.020.020.000.00390.080.00/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 --driver-mode=g++ -D GGML_BACKEND_BUILD -D GGML_BACKEND_SHARED -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHAR...
Loop 1331 - ops.cpp:5550-5563 - libggml-cpu.so+0.070.010.000.020.000.000.0010.000.00
Loop 1330 - ops.cpp:5551-5563 - libggml-cpu.so+0.000.010.000.010.000.000.0000.000.00
Loop 1329 - ops.cpp:5552-5563 - libggml-cpu.so0.130.010.010.010.010.000.0050.060.00
kmp_flag_native<unsigned long long, (flag_type)1, true>::done_check()libomp.so0.400.060.060.030.030.000.00210.150.01OMP (%): 100.00
__GI___pthread_mutex_unlock_usercntlibc.so.60.470.060.060.040.040.000.00280.160.01Pthread (%): 100.00
ggml_is_emptylibggml-base.so0.270.060.060.020.020.000.00310.090.00/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 -D GGML_BUILD -D GGML_COMMIT="unknown" -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHARED -D GGML_USE_CPU_KLEID...
__GI___lll_lock_wakelibc.so.60.200.050.050.020.020.000.00320.070.00System (%): 100.00
@plt_start@libomp.so0.200.050.050.010.010.000.00320.060.00OMP (%): 100.00
unknown_functionlibggml-cpu.so0.130.040.000.010.000.000.00300.040.00/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00
__kmp_yieldlibomp.so0.200.040.040.010.010.000.00250.060.00OMP (%): 100.00
ggml::cpu::kleidiai::tensor_traits::compute_forward_q4_0(ggml_compute_params*, ggml_tensor*)libggml-cpu.so0.130.030.030.010.010.000.00230.030.00/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 --driver-mode=g++ -D GGML_BACKEND_BUILD -D GGML_BACKEND_SHARED -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHAR...
__kmp_now_nseclibomp.so0.130.030.030.010.010.000.00190.040.00OMP (%): 100.00
__kmpc_barrierlibomp.so0.070.010.010.010.010.000.00100.010.00OMP (%): 100.00
__memsetlibastring.so0.540.010.010.040.040.000.0020.450.02String (%): 100.00
ggml_compute_forward_add+libggml-cpu.so0.130.010.010.010.010.000.0080.040.00/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/build/llama.cpp/../build/bin/libggml-blas.so (%): 100.00Arm C/C++/Fortran Compiler version 24.10.1 (build number 4) (based on LLVM 19.1.0) /opt/arm/arm-linux-compiler-24.10.1_AmazonLinux-2023/llvm-bin/clang-19 --driver-mode=g++ -D GGML_BACKEND_BUILD -D GGML_BACKEND_SHARED -D GGML_SCHED_MAX_COPIES=4 -D GGML_SHAR...
Loop 1059 - vec.h:80-80 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 1063 - vec.h:80-80 - libggml-cpu.so+0.000.000.000.000.000.000.0000.000.00
Loop 1062 - vec.h:80-80 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 1064 - vec.h:80-80 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 1061 - vec.h:80-80 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 1060 - vec.h:80-80 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 1065 - vec.h:80-80 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 1067 - vec.h:80-80 - libggml-cpu.so [...]+0.000.000.000.000.000.000.0000.000.00
Loop 1066 - vec.h:80-80 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 1069 - vec.h:80-80 - libggml-cpu.so+0.000.000.000.000.000.000.0000.000.00
Loop 1068 - vec.h:80-80 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 1070 - vec.h:80-80 - libggml-cpu.so0.000.000.000.000.000.000.0000.000.00
Loop 1072 - ops.cpp:1395-1422 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00
Loop 1071 - ops.cpp:1395-1424 - libggml-cpu.so [...]0.000.000.000.000.000.000.0000.000.00