* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 25658 tid 25658 thread 0 bound to OS proc set {0}
OMP: pid 25658 tid 25758 thread 2 bound to OS proc set {48}
OMP: pid 25658 tid 25757 thread 1 bound to OS proc set {24}
OMP: pid 25658 tid 25759 thread 3 bound to OS proc set {72}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 43.045486, "speed_tg": 47.577579, "t": 43.045486, "speed": 47.577579}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_2
To display your profiling results:
###########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_2 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_2 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_2 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_2 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_2 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_2 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_2 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_2 #
###########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 25829 tid 25829 thread 0 bound to OS proc set {0}
OMP: pid 25829 tid 25929 thread 2 bound to OS proc set {24}
OMP: pid 25829 tid 25931 thread 4 bound to OS proc set {48}
OMP: pid 25829 tid 25932 thread 5 bound to OS proc set {60}
OMP: pid 25829 tid 25933 thread 6 bound to OS proc set {72}
OMP: pid 25829 tid 25928 thread 1 bound to OS proc set {12}
OMP: pid 25829 tid 25930 thread 3 bound to OS proc set {36}
OMP: pid 25829 tid 25934 thread 7 bound to OS proc set {84}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 21.909391, "speed_tg": 93.475899, "t": 21.909391, "speed": 93.475899}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_3
To display your profiling results:
###########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_3 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_3 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_3 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_3 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_3 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_3 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_3 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_3 #
###########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 25955 tid 25955 thread 0 bound to OS proc set {0}
OMP: pid 25955 tid 26056 thread 3 bound to OS proc set {18}
OMP: pid 25955 tid 26054 thread 1 bound to OS proc set {6}
OMP: pid 25955 tid 26055 thread 2 bound to OS proc set {12}
OMP: pid 25955 tid 26058 thread 5 bound to OS proc set {30}
OMP: pid 25955 tid 26057 thread 4 bound to OS proc set {24}
OMP: pid 25955 tid 26060 thread 7 bound to OS proc set {42}
OMP: pid 25955 tid 26059 thread 6 bound to OS proc set {36}
OMP: pid 25955 tid 26065 thread 12 bound to OS proc set {72}
OMP: pid 25955 tid 26061 thread 8 bound to OS proc set {48}
OMP: pid 25955 tid 26062 thread 9 bound to OS proc set {54}
OMP: pid 25955 tid 26063 thread 10 bound to OS proc set {60}
OMP: pid 25955 tid 26066 thread 13 bound to OS proc set {78}
OMP: pid 25955 tid 26067 thread 14 bound to OS proc set {84}
OMP: pid 25955 tid 26064 thread 11 bound to OS proc set {66}
OMP: pid 25955 tid 26068 thread 15 bound to OS proc set {90}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 11.507160, "speed_tg": 177.976151, "t": 11.507160, "speed": 177.976151}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_4
To display your profiling results:
###########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_4 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_4 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_4 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_4 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_4 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_4 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_4 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_4 #
###########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 26089 tid 26089 thread 0 bound to OS proc set {0}
OMP: pid 26089 tid 26192 thread 5 bound to OS proc set {20}
OMP: pid 26089 tid 26188 thread 1 bound to OS proc set {4}
OMP: pid 26089 tid 26189 thread 2 bound to OS proc set {8}
OMP: pid 26089 tid 26190 thread 3 bound to OS proc set {12}
OMP: pid 26089 tid 26193 thread 6 bound to OS proc set {24}
OMP: pid 26089 tid 26196 thread 9 bound to OS proc set {36}
OMP: pid 26089 tid 26194 thread 7 bound to OS proc set {28}
OMP: pid 26089 tid 26197 thread 10 bound to OS proc set {40}
OMP: pid 26089 tid 26195 thread 8 bound to OS proc set {32}
OMP: pid 26089 tid 26198 thread 11 bound to OS proc set {44}
OMP: pid 26089 tid 26191 thread 4 bound to OS proc set {16}
OMP: pid 26089 tid 26205 thread 18 bound to OS proc set {72}
OMP: pid 26089 tid 26200 thread 13 bound to OS proc set {52}
OMP: pid 26089 tid 26199 thread 12 bound to OS proc set {48}
OMP: pid 26089 tid 26204 thread 17 bound to OS proc set {68}
OMP: pid 26089 tid 26201 thread 14 bound to OS proc set {56}
OMP: pid 26089 tid 26202 thread 15 bound to OS proc set {60}
OMP: pid 26089 tid 26206 thread 19 bound to OS proc set {76}
OMP: pid 26089 tid 26203 thread 16 bound to OS proc set {64}
OMP: pid 26089 tid 26208 thread 21 bound to OS proc set {84}
OMP: pid 26089 tid 26207 thread 20 bound to OS proc set {80}
OMP: pid 26089 tid 26210 thread 23 bound to OS proc set {92}
OMP: pid 26089 tid 26209 thread 22 bound to OS proc set {88}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 8.834927, "speed_tg": 231.807251, "t": 8.834927, "speed": 231.807251}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_5
To display your profiling results:
###########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_5 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_5 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_5 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_5 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_5 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_5 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_5 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_5 #
###########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 26279 tid 26279 thread 0 bound to OS proc set {0}
OMP: pid 26279 tid 26378 thread 1 bound to OS proc set {3}
OMP: pid 26279 tid 26379 thread 2 bound to OS proc set {6}
OMP: pid 26279 tid 26385 thread 8 bound to OS proc set {24}
OMP: pid 26279 tid 26380 thread 3 bound to OS proc set {9}
OMP: pid 26279 tid 26386 thread 9 bound to OS proc set {27}
OMP: pid 26279 tid 26387 thread 10 bound to OS proc set {30}
OMP: pid 26279 tid 26384 thread 7 bound to OS proc set {21}
OMP: pid 26279 tid 26381 thread 4 bound to OS proc set {12}
OMP: pid 26279 tid 26382 thread 5 bound to OS proc set {15}
OMP: pid 26279 tid 26390 thread 13 bound to OS proc set {39}
OMP: pid 26279 tid 26388 thread 11 bound to OS proc set {33}
OMP: pid 26279 tid 26383 thread 6 bound to OS proc set {18}
OMP: pid 26279 tid 26391 thread 14 bound to OS proc set {42}
OMP: pid 26279 tid 26389 thread 12 bound to OS proc set {36}
OMP: pid 26279 tid 26394 thread 17 bound to OS proc set {51}
OMP: pid 26279 tid 26392 thread 15 bound to OS proc set {45}
OMP: pid 26279 tid 26396 thread 19 bound to OS proc set {57}
OMP: pid 26279 tid 26395 thread 18 bound to OS proc set {54}
OMP: pid 26279 tid 26393 thread 16 bound to OS proc set {48}
OMP: pid 26279 tid 26397 thread 20 bound to OS proc set {60}
OMP: pid 26279 tid 26400 thread 23 bound to OS proc set {69}
OMP: pid 26279 tid 26405 thread 28 bound to OS proc set {84}
OMP: pid 26279 tid 26406 thread 29 bound to OS proc set {87}
OMP: pid 26279 tid 26398 thread 21 bound to OS proc set {63}
OMP: pid 26279 tid 26401 thread 24 bound to OS proc set {72}
OMP: pid 26279 tid 26407 thread 30 bound to OS proc set {90}
OMP: pid 26279 tid 26402 thread 25 bound to OS proc set {75}
OMP: pid 26279 tid 26399 thread 22 bound to OS proc set {66}
OMP: pid 26279 tid 26403 thread 26 bound to OS proc set {78}
OMP: pid 26279 tid 26408 thread 31 bound to OS proc set {93}
OMP: pid 26279 tid 26404 thread 27 bound to OS proc set {81}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 7.446552, "speed_tg": 275.026611, "t": 7.446552, "speed": 275.026611}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_6
To display your profiling results:
###########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_6 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_6 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_6 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_6 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_6 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_6 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_6 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_6 #
###########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 26429 tid 26429 thread 0 bound to OS proc set {0}
OMP: pid 26429 tid 26528 thread 1 bound to OS proc set {2}
OMP: pid 26429 tid 26530 thread 3 bound to OS proc set {7}
OMP: pid 26429 tid 26529 thread 2 bound to OS proc set {4}
OMP: pid 26429 tid 26531 thread 4 bound to OS proc set {9}
OMP: pid 26429 tid 26540 thread 13 bound to OS proc set {31}
OMP: pid 26429 tid 26537 thread 10 bound to OS proc set {24}
OMP: pid 26429 tid 26536 thread 9 bound to OS proc set {21}
OMP: pid 26429 tid 26532 thread 5 bound to OS proc set {12}
OMP: pid 26429 tid 26538 thread 11 bound to OS proc set {26}
OMP: pid 26429 tid 26533 thread 6 bound to OS proc set {14}
OMP: pid 26429 tid 26539 thread 12 bound to OS proc set {29}
OMP: pid 26429 tid 26535 thread 8 bound to OS proc set {19}
OMP: pid 26429 tid 26544 thread 17 bound to OS proc set {41}
OMP: pid 26429 tid 26545 thread 18 bound to OS proc set {43}
OMP: pid 26429 tid 26534 thread 7 bound to OS proc set {16}
OMP: pid 26429 tid 26541 thread 14 bound to OS proc set {33}
OMP: pid 26429 tid 26546 thread 19 bound to OS proc set {46}
OMP: pid 26429 tid 26559 thread 32 bound to OS proc set {77}
OMP: pid 26429 tid 26561 thread 34 bound to OS proc set {82}
OMP: pid 26429 tid 26560 thread 33 bound to OS proc set {80}
OMP: pid 26429 tid 26547 thread 20 bound to OS proc set {48}
OMP: pid 26429 tid 26562 thread 35 bound to OS proc set {84}
OMP: pid 26429 tid 26542 thread 15 bound to OS proc set {36}
OMP: pid 26429 tid 26557 thread 30 bound to OS proc set {72}
OMP: pid 26429 tid 26550 thread 23 bound to OS proc set {55}
OMP: pid 26429 tid 26552 thread 25 bound to OS proc set {60}
OMP: pid 26429 tid 26548 thread 21 bound to OS proc set {50}
OMP: pid 26429 tid 26551 thread 24 bound to OS proc set {58}
OMP: pid 26429 tid 26543 thread 16 bound to OS proc set {38}
OMP: pid 26429 tid 26556 thread 29 bound to OS proc set {70}
OMP: pid 26429 tid 26564 thread 37 bound to OS proc set {89}
OMP: pid 26429 tid 26563 thread 36 bound to OS proc set {87}
OMP: pid 26429 tid 26558 thread 31 bound to OS proc set {75}
OMP: pid 26429 tid 26555 thread 28 bound to OS proc set {67}
OMP: pid 26429 tid 26565 thread 38 bound to OS proc set {92}
OMP: pid 26429 tid 26553 thread 26 bound to OS proc set {63}
OMP: pid 26429 tid 26566 thread 39 bound to OS proc set {94}
OMP: pid 26429 tid 26549 thread 22 bound to OS proc set {53}
OMP: pid 26429 tid 26554 thread 27 bound to OS proc set {65}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 6.473384, "speed_tg": 316.372406, "t": 6.473384, "speed": 316.372406}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_7
To display your profiling results:
###########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_7 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_7 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_7 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_7 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_7 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_7 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_7 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_7 #
###########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 26589 tid 26589 thread 0 bound to OS proc set {0}
OMP: pid 26589 tid 26690 thread 1 bound to OS proc set {2}
OMP: pid 26589 tid 26692 thread 3 bound to OS proc set {6}
OMP: pid 26589 tid 26691 thread 2 bound to OS proc set {4}
OMP: pid 26589 tid 26702 thread 13 bound to OS proc set {26}
OMP: pid 26589 tid 26693 thread 4 bound to OS proc set {8}
OMP: pid 26589 tid 26694 thread 5 bound to OS proc set {10}
OMP: pid 26589 tid 26701 thread 12 bound to OS proc set {24}
OMP: pid 26589 tid 26700 thread 11 bound to OS proc set {22}
OMP: pid 26589 tid 26704 thread 15 bound to OS proc set {30}
OMP: pid 26589 tid 26698 thread 9 bound to OS proc set {18}
OMP: pid 26589 tid 26696 thread 7 bound to OS proc set {14}
OMP: pid 26589 tid 26703 thread 14 bound to OS proc set {28}
OMP: pid 26589 tid 26708 thread 19 bound to OS proc set {38}
OMP: pid 26589 tid 26695 thread 6 bound to OS proc set {12}
OMP: pid 26589 tid 26697 thread 8 bound to OS proc set {16}
OMP: pid 26589 tid 26706 thread 17 bound to OS proc set {34}
OMP: pid 26589 tid 26705 thread 16 bound to OS proc set {32}
OMP: pid 26589 tid 26723 thread 34 bound to OS proc set {68}
OMP: pid 26589 tid 26699 thread 10 bound to OS proc set {20}
OMP: pid 26589 tid 26724 thread 35 bound to OS proc set {70}
OMP: pid 26589 tid 26721 thread 32 bound to OS proc set {64}
OMP: pid 26589 tid 26707 thread 18 bound to OS proc set {36}
OMP: pid 26589 tid 26709 thread 20 bound to OS proc set {40}
OMP: pid 26589 tid 26722 thread 33 bound to OS proc set {66}
OMP: pid 26589 tid 26710 thread 21 bound to OS proc set {42}
OMP: pid 26589 tid 26726 thread 37 bound to OS proc set {74}
OMP: pid 26589 tid 26730 thread 41 bound to OS proc set {82}
OMP: pid 26589 tid 26711 thread 22 bound to OS proc set {44}
OMP: pid 26589 tid 26725 thread 36 bound to OS proc set {72}
OMP: pid 26589 tid 26729 thread 40 bound to OS proc set {80}
OMP: pid 26589 tid 26712 thread 23 bound to OS proc set {46}
OMP: pid 26589 tid 26719 thread 30 bound to OS proc set {60}
OMP: pid 26589 tid 26734 thread 45 bound to OS proc set {90}
OMP: pid 26589 tid 26718 thread 29 bound to OS proc set {58}
OMP: pid 26589 tid 26731 thread 42 bound to OS proc set {84}
OMP: pid 26589 tid 26714 thread 25 bound to OS proc set {50}
OMP: pid 26589 tid 26727 thread 38 bound to OS proc set {76}
OMP: pid 26589 tid 26732 thread 43 bound to OS proc set {86}
OMP: pid 26589 tid 26735 thread 46 bound to OS proc set {92}
OMP: pid 26589 tid 26733 thread 44 bound to OS proc set {88}
OMP: pid 26589 tid 26713 thread 24 bound to OS proc set {48}
OMP: pid 26589 tid 26717 thread 28 bound to OS proc set {56}
OMP: pid 26589 tid 26736 thread 47 bound to OS proc set {94}
OMP: pid 26589 tid 26728 thread 39 bound to OS proc set {78}
OMP: pid 26589 tid 26715 thread 26 bound to OS proc set {52}
OMP: pid 26589 tid 26716 thread 27 bound to OS proc set {54}
OMP: pid 26589 tid 26720 thread 31 bound to OS proc set {62}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 5.956303, "speed_tg": 343.837433, "t": 5.956303, "speed": 343.837433}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_8
To display your profiling results:
###########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_8 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_8 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_8 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_8 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_8 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_8 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_8 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_8 #
###########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 26807 tid 26906 thread 1 bound to OS proc set {1}
OMP: pid 26807 tid 26807 thread 0 bound to OS proc set {0}
OMP: pid 26807 tid 26907 thread 2 bound to OS proc set {3}
OMP: pid 26807 tid 26908 thread 3 bound to OS proc set {5}
OMP: pid 26807 tid 26910 thread 5 bound to OS proc set {8}
OMP: pid 26807 tid 26911 thread 6 bound to OS proc set {10}
OMP: pid 26807 tid 26918 thread 13 bound to OS proc set {22}
OMP: pid 26807 tid 26909 thread 4 bound to OS proc set {6}
OMP: pid 26807 tid 26912 thread 7 bound to OS proc set {12}
OMP: pid 26807 tid 26913 thread 8 bound to OS proc set {13}
OMP: pid 26807 tid 26919 thread 14 bound to OS proc set {24}
OMP: pid 26807 tid 26922 thread 17 bound to OS proc set {29}
OMP: pid 26807 tid 26915 thread 10 bound to OS proc set {17}
OMP: pid 26807 tid 26923 thread 18 bound to OS proc set {31}
OMP: pid 26807 tid 26916 thread 11 bound to OS proc set {19}
OMP: pid 26807 tid 26914 thread 9 bound to OS proc set {15}
OMP: pid 26807 tid 26921 thread 16 bound to OS proc set {27}
OMP: pid 26807 tid 26924 thread 19 bound to OS proc set {32}
OMP: pid 26807 tid 26938 thread 33 bound to OS proc set {57}
OMP: pid 26807 tid 26939 thread 34 bound to OS proc set {58}
OMP: pid 26807 tid 26955 thread 50 bound to OS proc set {86}
OMP: pid 26807 tid 26954 thread 49 bound to OS proc set {84}
OMP: pid 26807 tid 26940 thread 35 bound to OS proc set {60}
OMP: pid 26807 tid 26956 thread 51 bound to OS proc set {88}
OMP: pid 26807 tid 26953 thread 48 bound to OS proc set {83}
OMP: pid 26807 tid 26925 thread 20 bound to OS proc set {34}
OMP: pid 26807 tid 26917 thread 12 bound to OS proc set {20}
OMP: pid 26807 tid 26926 thread 21 bound to OS proc set {36}
OMP: pid 26807 tid 26933 thread 28 bound to OS proc set {48}
OMP: pid 26807 tid 26927 thread 22 bound to OS proc set {38}
OMP: pid 26807 tid 26920 thread 15 bound to OS proc set {25}
OMP: pid 26807 tid 26942 thread 37 bound to OS proc set {64}
OMP: pid 26807 tid 26950 thread 45 bound to OS proc set {77}
OMP: pid 26807 tid 26929 thread 24 bound to OS proc set {41}
OMP: pid 26807 tid 26946 thread 41 bound to OS proc set {71}
OMP: pid 26807 tid 26937 thread 32 bound to OS proc set {55}
OMP: pid 26807 tid 26943 thread 38 bound to OS proc set {65}
OMP: pid 26807 tid 26932 thread 27 bound to OS proc set {46}
OMP: pid 26807 tid 26930 thread 25 bound to OS proc set {43}
OMP: pid 26807 tid 26936 thread 31 bound to OS proc set {53}
OMP: pid 26807 tid 26931 thread 26 bound to OS proc set {45}
OMP: pid 26807 tid 26934 thread 29 bound to OS proc set {50}
OMP: pid 26807 tid 26928 thread 23 bound to OS proc set {39}
OMP: pid 26807 tid 26949 thread 44 bound to OS proc set {76}
OMP: pid 26807 tid 26944 thread 39 bound to OS proc set {67}
OMP: pid 26807 tid 26935 thread 30 bound to OS proc set {51}
OMP: pid 26807 tid 26941 thread 36 bound to OS proc set {62}
OMP: pid 26807 tid 26957 thread 52 bound to OS proc set {90}
OMP: pid 26807 tid 26948 thread 43 bound to OS proc set {74}
OMP: pid 26807 tid 26947 thread 42 bound to OS proc set {72}
OMP: pid 26807 tid 26951 thread 46 bound to OS proc set {79}
OMP: pid 26807 tid 26958 thread 53 bound to OS proc set {91}
OMP: pid 26807 tid 26945 thread 40 bound to OS proc set {69}
OMP: pid 26807 tid 26952 thread 47 bound to OS proc set {81}
OMP: pid 26807 tid 26959 thread 54 bound to OS proc set {93}
OMP: pid 26807 tid 26960 thread 55 bound to OS proc set {95}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 5.573398, "speed_tg": 367.459839, "t": 5.573398, "speed": 367.459839}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_9
To display your profiling results:
###########################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_9 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_9 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_9 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_9 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_9 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_9 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_9 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_9 #
###########################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 26981 tid 27080 thread 1 bound to OS proc set {1}
OMP: pid 26981 tid 26981 thread 0 bound to OS proc set {0}
OMP: pid 26981 tid 27081 thread 2 bound to OS proc set {3}
OMP: pid 26981 tid 27082 thread 3 bound to OS proc set {4}
OMP: pid 26981 tid 27085 thread 6 bound to OS proc set {9}
OMP: pid 26981 tid 27083 thread 4 bound to OS proc set {6}
OMP: pid 26981 tid 27087 thread 8 bound to OS proc set {12}
OMP: pid 26981 tid 27088 thread 9 bound to OS proc set {13}
OMP: pid 26981 tid 27086 thread 7 bound to OS proc set {10}
OMP: pid 26981 tid 27084 thread 5 bound to OS proc set {7}
OMP: pid 26981 tid 27096 thread 17 bound to OS proc set {25}
OMP: pid 26981 tid 27093 thread 14 bound to OS proc set {21}
OMP: pid 26981 tid 27089 thread 10 bound to OS proc set {15}
OMP: pid 26981 tid 27091 thread 12 bound to OS proc set {18}
OMP: pid 26981 tid 27097 thread 18 bound to OS proc set {27}
OMP: pid 26981 tid 27092 thread 13 bound to OS proc set {19}
OMP: pid 26981 tid 27090 thread 11 bound to OS proc set {16}
OMP: pid 26981 tid 27114 thread 35 bound to OS proc set {53}
OMP: pid 26981 tid 27112 thread 33 bound to OS proc set {50}
OMP: pid 26981 tid 27128 thread 49 bound to OS proc set {74}
OMP: pid 26981 tid 27101 thread 22 bound to OS proc set {33}
OMP: pid 26981 tid 27113 thread 34 bound to OS proc set {51}
OMP: pid 26981 tid 27104 thread 25 bound to OS proc set {37}
OMP: pid 26981 tid 27095 thread 16 bound to OS proc set {24}
OMP: pid 26981 tid 27103 thread 24 bound to OS proc set {36}
OMP: pid 26981 tid 27094 thread 15 bound to OS proc set {22}
OMP: pid 26981 tid 27129 thread 50 bound to OS proc set {75}
OMP: pid 26981 tid 27102 thread 23 bound to OS proc set {34}
OMP: pid 26981 tid 27098 thread 19 bound to OS proc set {28}
OMP: pid 26981 tid 27099 thread 20 bound to OS proc set {30}
OMP: pid 26981 tid 27111 thread 32 bound to OS proc set {48}
OMP: pid 26981 tid 27130 thread 51 bound to OS proc set {77}
OMP: pid 26981 tid 27109 thread 30 bound to OS proc set {45}
OMP: pid 26981 tid 27120 thread 41 bound to OS proc set {62}
OMP: pid 26981 tid 27107 thread 28 bound to OS proc set {42}
OMP: pid 26981 tid 27116 thread 37 bound to OS proc set {56}
OMP: pid 26981 tid 27118 thread 39 bound to OS proc set {59}
OMP: pid 26981 tid 27121 thread 42 bound to OS proc set {63}
OMP: pid 26981 tid 27127 thread 48 bound to OS proc set {72}
OMP: pid 26981 tid 27108 thread 29 bound to OS proc set {43}
OMP: pid 26981 tid 27124 thread 45 bound to OS proc set {68}
OMP: pid 26981 tid 27100 thread 21 bound to OS proc set {31}
OMP: pid 26981 tid 27126 thread 47 bound to OS proc set {71}
OMP: pid 26981 tid 27115 thread 36 bound to OS proc set {54}
OMP: pid 26981 tid 27110 thread 31 bound to OS proc set {46}
OMP: pid 26981 tid 27125 thread 46 bound to OS proc set {69}
OMP: pid 26981 tid 27105 thread 26 bound to OS proc set {39}
OMP: pid 26981 tid 27122 thread 43 bound to OS proc set {65}
OMP: pid 26981 tid 27131 thread 52 bound to OS proc set {78}
OMP: pid 26981 tid 27106 thread 27 bound to OS proc set {40}
OMP: pid 26981 tid 27117 thread 38 bound to OS proc set {57}
OMP: pid 26981 tid 27119 thread 40 bound to OS proc set {60}
OMP: pid 26981 tid 27134 thread 55 bound to OS proc set {83}
OMP: pid 26981 tid 27133 thread 54 bound to OS proc set {81}
OMP: pid 26981 tid 27135 thread 56 bound to OS proc set {84}
OMP: pid 26981 tid 27132 thread 53 bound to OS proc set {80}
OMP: pid 26981 tid 27136 thread 57 bound to OS proc set {86}
OMP: pid 26981 tid 27123 thread 44 bound to OS proc set {66}
OMP: pid 26981 tid 27140 thread 61 bound to OS proc set {92}
OMP: pid 26981 tid 27139 thread 60 bound to OS proc set {90}
OMP: pid 26981 tid 27137 thread 58 bound to OS proc set {87}
OMP: pid 26981 tid 27142 thread 63 bound to OS proc set {95}
OMP: pid 26981 tid 27141 thread 62 bound to OS proc set {93}
OMP: pid 26981 tid 27138 thread 59 bound to OS proc set {89}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000001, "speed_pp": 0.000000, "t_tg": 5.264583, "speed_tg": 389.014648, "t": 5.264584, "speed": 389.014587}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_10
To display your profiling results:
############################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
############################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_10 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_10 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_10 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_10 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_10 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_10 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_10 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_10 #
############################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 27163 tid 27263 thread 2 bound to OS proc set {2}
OMP: pid 27163 tid 27262 thread 1 bound to OS proc set {1}
OMP: pid 27163 tid 27163 thread 0 bound to OS proc set {0}
OMP: pid 27163 tid 27264 thread 3 bound to OS proc set {4}
OMP: pid 27163 tid 27267 thread 6 bound to OS proc set {8}
OMP: pid 27163 tid 27266 thread 5 bound to OS proc set {6}
OMP: pid 27163 tid 27268 thread 7 bound to OS proc set {9}
OMP: pid 27163 tid 27265 thread 4 bound to OS proc set {5}
OMP: pid 27163 tid 27269 thread 8 bound to OS proc set {10}
OMP: pid 27163 tid 27271 thread 10 bound to OS proc set {13}
OMP: pid 27163 tid 27270 thread 9 bound to OS proc set {12}
OMP: pid 27163 tid 27272 thread 11 bound to OS proc set {14}
OMP: pid 27163 tid 27274 thread 13 bound to OS proc set {17}
OMP: pid 27163 tid 27278 thread 17 bound to OS proc set {22}
OMP: pid 27163 tid 27294 thread 33 bound to OS proc set {44}
OMP: pid 27163 tid 27310 thread 49 bound to OS proc set {66}
OMP: pid 27163 tid 27295 thread 34 bound to OS proc set {45}
OMP: pid 27163 tid 27275 thread 14 bound to OS proc set {18}
OMP: pid 27163 tid 27273 thread 12 bound to OS proc set {16}
OMP: pid 27163 tid 27311 thread 50 bound to OS proc set {67}
OMP: pid 27163 tid 27279 thread 18 bound to OS proc set {24}
OMP: pid 27163 tid 27296 thread 35 bound to OS proc set {47}
OMP: pid 27163 tid 27277 thread 16 bound to OS proc set {21}
OMP: pid 27163 tid 27280 thread 19 bound to OS proc set {25}
OMP: pid 27163 tid 27312 thread 51 bound to OS proc set {68}
OMP: pid 27163 tid 27326 thread 65 bound to OS proc set {87}
OMP: pid 27163 tid 27290 thread 29 bound to OS proc set {39}
OMP: pid 27163 tid 27276 thread 15 bound to OS proc set {20}
OMP: pid 27163 tid 27289 thread 28 bound to OS proc set {37}
OMP: pid 27163 tid 27288 thread 27 bound to OS proc set {36}
OMP: pid 27163 tid 27293 thread 32 bound to OS proc set {43}
OMP: pid 27163 tid 27327 thread 66 bound to OS proc set {88}
OMP: pid 27163 tid 27282 thread 21 bound to OS proc set {28}
OMP: pid 27163 tid 27287 thread 26 bound to OS proc set {35}
OMP: pid 27163 tid 27328 thread 67 bound to OS proc set {90}
OMP: pid 27163 tid 27285 thread 24 bound to OS proc set {32}
OMP: pid 27163 tid 27300 thread 39 bound to OS proc set {52}
OMP: pid 27163 tid 27325 thread 64 bound to OS proc set {86}
OMP: pid 27163 tid 27292 thread 31 bound to OS proc set {41}
OMP: pid 27163 tid 27286 thread 25 bound to OS proc set {33}
OMP: pid 27163 tid 27298 thread 37 bound to OS proc set {49}
OMP: pid 27163 tid 27302 thread 41 bound to OS proc set {55}
OMP: pid 27163 tid 27291 thread 30 bound to OS proc set {40}
OMP: pid 27163 tid 27305 thread 44 bound to OS proc set {59}
OMP: pid 27163 tid 27284 thread 23 bound to OS proc set {30}
OMP: pid 27163 tid 27283 thread 22 bound to OS proc set {29}
OMP: pid 27163 tid 27316 thread 55 bound to OS proc set {74}
OMP: pid 27163 tid 27281 thread 20 bound to OS proc set {26}
OMP: pid 27163 tid 27315 thread 54 bound to OS proc set {72}
OMP: pid 27163 tid 27314 thread 53 bound to OS proc set {71}
OMP: pid 27163 tid 27301 thread 40 bound to OS proc set {53}
OMP: pid 27163 tid 27309 thread 48 bound to OS proc set {64}
OMP: pid 27163 tid 27297 thread 36 bound to OS proc set {48}
OMP: pid 27163 tid 27303 thread 42 bound to OS proc set {56}
OMP: pid 27163 tid 27304 thread 43 bound to OS proc set {57}
OMP: pid 27163 tid 27306 thread 45 bound to OS proc set {60}
OMP: pid 27163 tid 27330 thread 69 bound to OS proc set {92}
OMP: pid 27163 tid 27299 thread 38 bound to OS proc set {51}
OMP: pid 27163 tid 27307 thread 46 bound to OS proc set {61}
OMP: pid 27163 tid 27322 thread 61 bound to OS proc set {82}
OMP: pid 27163 tid 27329 thread 68 bound to OS proc set {91}
OMP: pid 27163 tid 27321 thread 60 bound to OS proc set {80}
OMP: pid 27163 tid 27319 thread 58 bound to OS proc set {78}
OMP: pid 27163 tid 27313 thread 52 bound to OS proc set {70}
OMP: pid 27163 tid 27320 thread 59 bound to OS proc set {79}
OMP: pid 27163 tid 27308 thread 47 bound to OS proc set {63}
OMP: pid 27163 tid 27324 thread 63 bound to OS proc set {84}
OMP: pid 27163 tid 27317 thread 56 bound to OS proc set {75}
OMP: pid 27163 tid 27331 thread 70 bound to OS proc set {94}
OMP: pid 27163 tid 27318 thread 57 bound to OS proc set {76}
OMP: pid 27163 tid 27332 thread 71 bound to OS proc set {95}
OMP: pid 27163 tid 27323 thread 62 bound to OS proc set {83}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 72, "n_threads_batch": 72, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 5.085026, "speed_tg": 402.751160, "t": 5.085026, "speed": 402.751160}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_11
To display your profiling results:
############################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
############################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_11 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_11 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_11 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_11 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_11 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_11 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_11 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_11 #
############################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 27353 tid 27453 thread 2 bound to OS proc set {2}
OMP: pid 27353 tid 27454 thread 3 bound to OS proc set {3}
OMP: pid 27353 tid 27452 thread 1 bound to OS proc set {1}
OMP: pid 27353 tid 27455 thread 4 bound to OS proc set {4}
OMP: pid 27353 tid 27353 thread 0 bound to OS proc set {0}
OMP: pid 27353 tid 27456 thread 5 bound to OS proc set {6}
OMP: pid 27353 tid 27458 thread 7 bound to OS proc set {8}
OMP: pid 27353 tid 27459 thread 8 bound to OS proc set {9}
OMP: pid 27353 tid 27457 thread 6 bound to OS proc set {7}
OMP: pid 27353 tid 27460 thread 9 bound to OS proc set {10}
OMP: pid 27353 tid 27463 thread 12 bound to OS proc set {14}
OMP: pid 27353 tid 27461 thread 10 bound to OS proc set {12}
OMP: pid 27353 tid 27462 thread 11 bound to OS proc set {13}
OMP: pid 27353 tid 27464 thread 13 bound to OS proc set {15}
OMP: pid 27353 tid 27466 thread 15 bound to OS proc set {18}
OMP: pid 27353 tid 27484 thread 33 bound to OS proc set {40}
OMP: pid 27353 tid 27500 thread 49 bound to OS proc set {59}
OMP: pid 27353 tid 27485 thread 34 bound to OS proc set {41}
OMP: pid 27353 tid 27470 thread 19 bound to OS proc set {23}
OMP: pid 27353 tid 27465 thread 14 bound to OS proc set {16}
OMP: pid 27353 tid 27467 thread 16 bound to OS proc set {19}
OMP: pid 27353 tid 27501 thread 50 bound to OS proc set {60}
OMP: pid 27353 tid 27469 thread 18 bound to OS proc set {21}
OMP: pid 27353 tid 27517 thread 66 bound to OS proc set {80}
OMP: pid 27353 tid 27516 thread 65 bound to OS proc set {78}
OMP: pid 27353 tid 27468 thread 17 bound to OS proc set {20}
OMP: pid 27353 tid 27472 thread 21 bound to OS proc set {25}
OMP: pid 27353 tid 27518 thread 67 bound to OS proc set {81}
OMP: pid 27353 tid 27473 thread 22 bound to OS proc set {26}
OMP: pid 27353 tid 27480 thread 29 bound to OS proc set {35}
OMP: pid 27353 tid 27471 thread 20 bound to OS proc set {24}
OMP: pid 27353 tid 27475 thread 24 bound to OS proc set {29}
OMP: pid 27353 tid 27487 thread 36 bound to OS proc set {43}
OMP: pid 27353 tid 27515 thread 64 bound to OS proc set {77}
OMP: pid 27353 tid 27486 thread 35 bound to OS proc set {42}
OMP: pid 27353 tid 27476 thread 25 bound to OS proc set {30}
OMP: pid 27353 tid 27481 thread 30 bound to OS proc set {36}
OMP: pid 27353 tid 27483 thread 32 bound to OS proc set {38}
OMP: pid 27353 tid 27479 thread 28 bound to OS proc set {33}
OMP: pid 27353 tid 27492 thread 41 bound to OS proc set {49}
OMP: pid 27353 tid 27482 thread 31 bound to OS proc set {37}
OMP: pid 27353 tid 27477 thread 26 bound to OS proc set {31}
OMP: pid 27353 tid 27499 thread 48 bound to OS proc set {58}
OMP: pid 27353 tid 27496 thread 45 bound to OS proc set {54}
OMP: pid 27353 tid 27493 thread 42 bound to OS proc set {50}
OMP: pid 27353 tid 27504 thread 53 bound to OS proc set {64}
OMP: pid 27353 tid 27508 thread 57 bound to OS proc set {69}
OMP: pid 27353 tid 27495 thread 44 bound to OS proc set {53}
OMP: pid 27353 tid 27491 thread 40 bound to OS proc set {48}
OMP: pid 27353 tid 27478 thread 27 bound to OS proc set {32}
OMP: pid 27353 tid 27498 thread 47 bound to OS proc set {56}
OMP: pid 27353 tid 27506 thread 55 bound to OS proc set {66}
OMP: pid 27353 tid 27494 thread 43 bound to OS proc set {52}
OMP: pid 27353 tid 27488 thread 37 bound to OS proc set {44}
OMP: pid 27353 tid 27507 thread 56 bound to OS proc set {67}
OMP: pid 27353 tid 27489 thread 38 bound to OS proc set {46}
OMP: pid 27353 tid 27497 thread 46 bound to OS proc set {55}
OMP: pid 27353 tid 27474 thread 23 bound to OS proc set {27}
OMP: pid 27353 tid 27490 thread 39 bound to OS proc set {47}
OMP: pid 27353 tid 27511 thread 60 bound to OS proc set {72}
OMP: pid 27353 tid 27502 thread 51 bound to OS proc set {61}
OMP: pid 27353 tid 27524 thread 73 bound to OS proc set {88}
OMP: pid 27353 tid 27503 thread 52 bound to OS proc set {63}
OMP: pid 27353 tid 27509 thread 58 bound to OS proc set {70}
OMP: pid 27353 tid 27521 thread 70 bound to OS proc set {84}
OMP: pid 27353 tid 27523 thread 72 bound to OS proc set {87}
OMP: pid 27353 tid 27510 thread 59 bound to OS proc set {71}
OMP: pid 27353 tid 27525 thread 74 bound to OS proc set {89}
OMP: pid 27353 tid 27512 thread 61 bound to OS proc set {73}
OMP: pid 27353 tid 27505 thread 54 bound to OS proc set {65}
OMP: pid 27353 tid 27522 thread 71 bound to OS proc set {86}
OMP: pid 27353 tid 27526 thread 75 bound to OS proc set {90}
OMP: pid 27353 tid 27527 thread 76 bound to OS proc set {92}
OMP: pid 27353 tid 27513 thread 62 bound to OS proc set {75}
OMP: pid 27353 tid 27514 thread 63 bound to OS proc set {76}
OMP: pid 27353 tid 27519 thread 68 bound to OS proc set {82}
OMP: pid 27353 tid 27528 thread 77 bound to OS proc set {93}
OMP: pid 27353 tid 27529 thread 78 bound to OS proc set {94}
OMP: pid 27353 tid 27520 thread 69 bound to OS proc set {83}
OMP: pid 27353 tid 27530 thread 79 bound to OS proc set {95}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 80, "n_threads_batch": 80, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 5.027599, "speed_tg": 407.351501, "t": 5.027599, "speed": 407.351501}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_12
To display your profiling results:
############################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
############################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_12 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_12 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_12 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_12 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_12 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_12 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_12 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_12 #
############################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 27599 tid 27699 thread 2 bound to OS proc set {2}
OMP: pid 27599 tid 27700 thread 3 bound to OS proc set {3}
OMP: pid 27599 tid 27698 thread 1 bound to OS proc set {1}
OMP: pid 27599 tid 27702 thread 5 bound to OS proc set {5}
OMP: pid 27599 tid 27701 thread 4 bound to OS proc set {4}
OMP: pid 27599 tid 27703 thread 6 bound to OS proc set {6}
OMP: pid 27599 tid 27706 thread 9 bound to OS proc set {9}
OMP: pid 27599 tid 27599 thread 0 bound to OS proc set {0}
OMP: pid 27599 tid 27705 thread 8 bound to OS proc set {8}
OMP: pid 27599 tid 27704 thread 7 bound to OS proc set {7}
OMP: pid 27599 tid 27714 thread 17 bound to OS proc set {18}
OMP: pid 27599 tid 27712 thread 15 bound to OS proc set {16}
OMP: pid 27599 tid 27730 thread 33 bound to OS proc set {36}
OMP: pid 27599 tid 27715 thread 18 bound to OS proc set {19}
OMP: pid 27599 tid 27731 thread 34 bound to OS proc set {37}
OMP: pid 27599 tid 27709 thread 12 bound to OS proc set {13}
OMP: pid 27599 tid 27716 thread 19 bound to OS proc set {20}
OMP: pid 27599 tid 27732 thread 35 bound to OS proc set {38}
OMP: pid 27599 tid 27713 thread 16 bound to OS proc set {17}
OMP: pid 27599 tid 27710 thread 13 bound to OS proc set {14}
OMP: pid 27599 tid 27711 thread 14 bound to OS proc set {15}
OMP: pid 27599 tid 27708 thread 11 bound to OS proc set {12}
OMP: pid 27599 tid 27707 thread 10 bound to OS proc set {11}
OMP: pid 27599 tid 27729 thread 32 bound to OS proc set {35}
OMP: pid 27599 tid 27734 thread 37 bound to OS proc set {40}
OMP: pid 27599 tid 27746 thread 49 bound to OS proc set {54}
OMP: pid 27599 tid 27722 thread 25 bound to OS proc set {27}
OMP: pid 27599 tid 27721 thread 24 bound to OS proc set {26}
OMP: pid 27599 tid 27725 thread 28 bound to OS proc set {30}
OMP: pid 27599 tid 27733 thread 36 bound to OS proc set {39}
OMP: pid 27599 tid 27724 thread 27 bound to OS proc set {29}
OMP: pid 27599 tid 27727 thread 30 bound to OS proc set {33}
OMP: pid 27599 tid 27745 thread 48 bound to OS proc set {52}
OMP: pid 27599 tid 27735 thread 38 bound to OS proc set {41}
OMP: pid 27599 tid 27736 thread 39 bound to OS proc set {42}
OMP: pid 27599 tid 27762 thread 65 bound to OS proc set {71}
OMP: pid 27599 tid 27728 thread 31 bound to OS proc set {34}
OMP: pid 27599 tid 27717 thread 20 bound to OS proc set {22}
OMP: pid 27599 tid 27718 thread 21 bound to OS proc set {23}
OMP: pid 27599 tid 27719 thread 22 bound to OS proc set {24}
OMP: pid 27599 tid 27737 thread 40 bound to OS proc set {44}
OMP: pid 27599 tid 27726 thread 29 bound to OS proc set {31}
OMP: pid 27599 tid 27738 thread 41 bound to OS proc set {45}
OMP: pid 27599 tid 27747 thread 50 bound to OS proc set {55}
OMP: pid 27599 tid 27723 thread 26 bound to OS proc set {28}
OMP: pid 27599 tid 27741 thread 44 bound to OS proc set {48}
OMP: pid 27599 tid 27748 thread 51 bound to OS proc set {56}
OMP: pid 27599 tid 27740 thread 43 bound to OS proc set {47}
OMP: pid 27599 tid 27749 thread 52 bound to OS proc set {57}
OMP: pid 27599 tid 27720 thread 23 bound to OS proc set {25}
OMP: pid 27599 tid 27739 thread 42 bound to OS proc set {46}
OMP: pid 27599 tid 27742 thread 45 bound to OS proc set {49}
OMP: pid 27599 tid 27761 thread 64 bound to OS proc set {70}
OMP: pid 27599 tid 27764 thread 67 bound to OS proc set {73}
OMP: pid 27599 tid 27743 thread 46 bound to OS proc set {50}
OMP: pid 27599 tid 27751 thread 54 bound to OS proc set {59}
OMP: pid 27599 tid 27754 thread 57 bound to OS proc set {62}
OMP: pid 27599 tid 27744 thread 47 bound to OS proc set {51}
OMP: pid 27599 tid 27750 thread 53 bound to OS proc set {58}
OMP: pid 27599 tid 27752 thread 55 bound to OS proc set {60}
OMP: pid 27599 tid 27767 thread 70 bound to OS proc set {77}
OMP: pid 27599 tid 27759 thread 62 bound to OS proc set {68}
OMP: pid 27599 tid 27755 thread 58 bound to OS proc set {63}
OMP: pid 27599 tid 27753 thread 56 bound to OS proc set {61}
OMP: pid 27599 tid 27757 thread 60 bound to OS proc set {66}
OMP: pid 27599 tid 27756 thread 59 bound to OS proc set {65}
OMP: pid 27599 tid 27758 thread 61 bound to OS proc set {67}
OMP: pid 27599 tid 27760 thread 63 bound to OS proc set {69}
OMP: pid 27599 tid 27766 thread 69 bound to OS proc set {76}
OMP: pid 27599 tid 27763 thread 66 bound to OS proc set {72}
OMP: pid 27599 tid 27771 thread 74 bound to OS proc set {81}
OMP: pid 27599 tid 27778 thread 81 bound to OS proc set {89}
OMP: pid 27599 tid 27769 thread 72 bound to OS proc set {79}
OMP: pid 27599 tid 27773 thread 76 bound to OS proc set {83}
OMP: pid 27599 tid 27770 thread 73 bound to OS proc set {80}
OMP: pid 27599 tid 27779 thread 82 bound to OS proc set {90}
OMP: pid 27599 tid 27772 thread 75 bound to OS proc set {82}
OMP: pid 27599 tid 27775 thread 78 bound to OS proc set {85}
OMP: pid 27599 tid 27765 thread 68 bound to OS proc set {74}
OMP: pid 27599 tid 27768 thread 71 bound to OS proc set {78}
OMP: pid 27599 tid 27780 thread 83 bound to OS proc set {91}
OMP: pid 27599 tid 27774 thread 77 bound to OS proc set {84}
OMP: pid 27599 tid 27777 thread 80 bound to OS proc set {88}
OMP: pid 27599 tid 27776 thread 79 bound to OS proc set {87}
OMP: pid 27599 tid 27782 thread 85 bound to OS proc set {93}
OMP: pid 27599 tid 27781 thread 84 bound to OS proc set {92}
OMP: pid 27599 tid 27783 thread 86 bound to OS proc set {94}
OMP: pid 27599 tid 27784 thread 87 bound to OS proc set {95}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 88, "n_threads_batch": 88, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 5.011001, "speed_tg": 408.700775, "t": 5.011001, "speed": 408.700775}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_13
To display your profiling results:
############################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
############################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_13 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_13 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_13 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_13 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_13 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_13 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_13 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_13 #
############################################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-47-249.ec2.internal.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 27806 tid 27907 thread 3 bound to OS proc set {3}
OMP: pid 27806 tid 27909 thread 5 bound to OS proc set {5}
OMP: pid 27806 tid 27908 thread 4 bound to OS proc set {4}
OMP: pid 27806 tid 27910 thread 6 bound to OS proc set {6}
OMP: pid 27806 tid 27806 thread 0 bound to OS proc set {0}
OMP: pid 27806 tid 27912 thread 8 bound to OS proc set {8}
OMP: pid 27806 tid 27913 thread 9 bound to OS proc set {9}
OMP: pid 27806 tid 27911 thread 7 bound to OS proc set {7}
OMP: pid 27806 tid 27905 thread 1 bound to OS proc set {1}
OMP: pid 27806 tid 27906 thread 2 bound to OS proc set {2}
OMP: pid 27806 tid 27914 thread 10 bound to OS proc set {10}
OMP: pid 27806 tid 27918 thread 14 bound to OS proc set {14}
OMP: pid 27806 tid 27917 thread 13 bound to OS proc set {13}
OMP: pid 27806 tid 27916 thread 12 bound to OS proc set {12}
OMP: pid 27806 tid 27915 thread 11 bound to OS proc set {11}
OMP: pid 27806 tid 27919 thread 15 bound to OS proc set {15}
OMP: pid 27806 tid 27923 thread 19 bound to OS proc set {19}
OMP: pid 27806 tid 27937 thread 33 bound to OS proc set {33}
OMP: pid 27806 tid 27922 thread 18 bound to OS proc set {18}
OMP: pid 27806 tid 27953 thread 49 bound to OS proc set {49}
OMP: pid 27806 tid 27920 thread 16 bound to OS proc set {16}
OMP: pid 27806 tid 27921 thread 17 bound to OS proc set {17}
OMP: pid 27806 tid 27939 thread 35 bound to OS proc set {35}
OMP: pid 27806 tid 27954 thread 50 bound to OS proc set {50}
OMP: pid 27806 tid 27955 thread 51 bound to OS proc set {51}
OMP: pid 27806 tid 27938 thread 34 bound to OS proc set {34}
OMP: pid 27806 tid 27926 thread 22 bound to OS proc set {22}
OMP: pid 27806 tid 27970 thread 66 bound to OS proc set {66}
OMP: pid 27806 tid 27924 thread 20 bound to OS proc set {20}
OMP: pid 27806 tid 27925 thread 21 bound to OS proc set {21}
OMP: pid 27806 tid 27940 thread 36 bound to OS proc set {36}
OMP: pid 27806 tid 27971 thread 67 bound to OS proc set {67}
OMP: pid 27806 tid 27942 thread 38 bound to OS proc set {38}
OMP: pid 27806 tid 27929 thread 25 bound to OS proc set {25}
OMP: pid 27806 tid 27928 thread 24 bound to OS proc set {24}
OMP: pid 27806 tid 27936 thread 32 bound to OS proc set {32}
OMP: pid 27806 tid 27934 thread 30 bound to OS proc set {30}
OMP: pid 27806 tid 27932 thread 28 bound to OS proc set {28}
OMP: pid 27806 tid 27945 thread 41 bound to OS proc set {41}
OMP: pid 27806 tid 27944 thread 40 bound to OS proc set {40}
OMP: pid 27806 tid 27927 thread 23 bound to OS proc set {23}
OMP: pid 27806 tid 27943 thread 39 bound to OS proc set {39}
OMP: pid 27806 tid 27933 thread 29 bound to OS proc set {29}
OMP: pid 27806 tid 27952 thread 48 bound to OS proc set {48}
OMP: pid 27806 tid 27948 thread 44 bound to OS proc set {44}
OMP: pid 27806 tid 27961 thread 57 bound to OS proc set {57}
OMP: pid 27806 tid 27949 thread 45 bound to OS proc set {45}
OMP: pid 27806 tid 27946 thread 42 bound to OS proc set {42}
OMP: pid 27806 tid 27958 thread 54 bound to OS proc set {54}
OMP: pid 27806 tid 27973 thread 69 bound to OS proc set {69}
OMP: pid 27806 tid 27962 thread 58 bound to OS proc set {58}
OMP: pid 27806 tid 27957 thread 53 bound to OS proc set {53}
OMP: pid 27806 tid 27930 thread 26 bound to OS proc set {26}
OMP: pid 27806 tid 27941 thread 37 bound to OS proc set {37}
OMP: pid 27806 tid 27977 thread 73 bound to OS proc set {73}
OMP: pid 27806 tid 27950 thread 46 bound to OS proc set {46}
OMP: pid 27806 tid 27947 thread 43 bound to OS proc set {43}
OMP: pid 27806 tid 27951 thread 47 bound to OS proc set {47}
OMP: pid 27806 tid 27956 thread 52 bound to OS proc set {52}
OMP: pid 27806 tid 27978 thread 74 bound to OS proc set {74}
OMP: pid 27806 tid 27931 thread 27 bound to OS proc set {27}
OMP: pid 27806 tid 27981 thread 77 bound to OS proc set {77}
OMP: pid 27806 tid 27969 thread 65 bound to OS proc set {65}
OMP: pid 27806 tid 27985 thread 81 bound to OS proc set {81}
OMP: pid 27806 tid 27964 thread 60 bound to OS proc set {60}
OMP: pid 27806 tid 27965 thread 61 bound to OS proc set {61}
OMP: pid 27806 tid 27980 thread 76 bound to OS proc set {76}
OMP: pid 27806 tid 27972 thread 68 bound to OS proc set {68}
OMP: pid 27806 tid 27968 thread 64 bound to OS proc set {64}
OMP: pid 27806 tid 27982 thread 78 bound to OS proc set {78}
OMP: pid 27806 tid 27986 thread 82 bound to OS proc set {82}
OMP: pid 27806 tid 27976 thread 72 bound to OS proc set {72}
OMP: pid 27806 tid 27975 thread 71 bound to OS proc set {71}
OMP: pid 27806 tid 27974 thread 70 bound to OS proc set {70}
OMP: pid 27806 tid 27959 thread 55 bound to OS proc set {55}
OMP: pid 27806 tid 27935 thread 31 bound to OS proc set {31}
OMP: pid 27806 tid 27987 thread 83 bound to OS proc set {83}
OMP: pid 27806 tid 27966 thread 62 bound to OS proc set {62}
OMP: pid 27806 tid 27963 thread 59 bound to OS proc set {59}
OMP: pid 27806 tid 27967 thread 63 bound to OS proc set {63}
OMP: pid 27806 tid 27984 thread 80 bound to OS proc set {80}
OMP: pid 27806 tid 27979 thread 75 bound to OS proc set {75}
OMP: pid 27806 tid 27960 thread 56 bound to OS proc set {56}
OMP: pid 27806 tid 27992 thread 88 bound to OS proc set {88}
OMP: pid 27806 tid 27983 thread 79 bound to OS proc set {79}
OMP: pid 27806 tid 27993 thread 89 bound to OS proc set {89}
OMP: pid 27806 tid 27989 thread 85 bound to OS proc set {85}
OMP: pid 27806 tid 27998 thread 94 bound to OS proc set {94}
OMP: pid 27806 tid 27996 thread 92 bound to OS proc set {92}
OMP: pid 27806 tid 27997 thread 93 bound to OS proc set {93}
OMP: pid 27806 tid 27994 thread 90 bound to OS proc set {90}
OMP: pid 27806 tid 27995 thread 91 bound to OS proc set {91}
OMP: pid 27806 tid 27990 thread 86 bound to OS proc set {86}
OMP: pid 27806 tid 27991 thread 87 bound to OS proc set {87}
OMP: pid 27806 tid 27999 thread 95 bound to OS proc set {95}
OMP: pid 27806 tid 27988 thread 84 bound to OS proc set {84}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 96, "n_threads_batch": 96, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 5.009973, "speed_tg": 408.784637, "t": 5.009973, "speed": 408.784637}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_14
To display your profiling results:
############################################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
############################################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_14 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_14 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_14 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_14 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_14 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_14 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_14 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-47-249.ec2.internal/176-138-2040/llama.cpp/run/oneview_runs/multicore/armclang_3/oneview_results_1761383857/tools/lprof_npsu_run_14 #
############################################################################################################################################################################################################################################