* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 17493 tid 17493 thread 0 bound to OS proc set {0}
OMP: pid 17493 tid 17560 thread 1 bound to OS proc set {16}
OMP: pid 17493 tid 17561 thread 2 bound to OS proc set {32}
OMP: pid 17493 tid 17562 thread 3 bound to OS proc set {48}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 4, "n_threads_batch": 4, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 53.168232, "speed_tg": 38.519241, "t": 53.168232, "speed": 38.519241}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_2
To display your profiling results:
###########################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_2 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_2 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_2 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_2 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_2 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_2 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_2 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_2 #
###########################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 17590 tid 17590 thread 0 bound to OS proc set {0}
OMP: pid 17590 tid 17658 thread 2 bound to OS proc set {16}
OMP: pid 17590 tid 17659 thread 3 bound to OS proc set {24}
OMP: pid 17590 tid 17660 thread 4 bound to OS proc set {32}
OMP: pid 17590 tid 17657 thread 1 bound to OS proc set {8}
OMP: pid 17590 tid 17661 thread 5 bound to OS proc set {40}
OMP: pid 17590 tid 17662 thread 6 bound to OS proc set {48}
OMP: pid 17590 tid 17663 thread 7 bound to OS proc set {56}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 8, "n_threads_batch": 8, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000001, "speed_pp": 0.000000, "t_tg": 27.197651, "speed_tg": 75.300621, "t": 27.197653, "speed": 75.300613}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_3
To display your profiling results:
###########################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_3 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_3 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_3 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_3 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_3 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_3 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_3 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_3 #
###########################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 17794 tid 17794 thread 0 bound to OS proc set {0}
OMP: pid 17794 tid 17862 thread 2 bound to OS proc set {8}
OMP: pid 17794 tid 17863 thread 3 bound to OS proc set {12}
OMP: pid 17794 tid 17861 thread 1 bound to OS proc set {4}
OMP: pid 17794 tid 17868 thread 8 bound to OS proc set {32}
OMP: pid 17794 tid 17864 thread 4 bound to OS proc set {16}
OMP: pid 17794 tid 17865 thread 5 bound to OS proc set {20}
OMP: pid 17794 tid 17870 thread 10 bound to OS proc set {40}
OMP: pid 17794 tid 17867 thread 7 bound to OS proc set {28}
OMP: pid 17794 tid 17866 thread 6 bound to OS proc set {24}
OMP: pid 17794 tid 17872 thread 12 bound to OS proc set {48}
OMP: pid 17794 tid 17869 thread 9 bound to OS proc set {36}
OMP: pid 17794 tid 17871 thread 11 bound to OS proc set {44}
OMP: pid 17794 tid 17874 thread 14 bound to OS proc set {56}
OMP: pid 17794 tid 17873 thread 13 bound to OS proc set {52}
OMP: pid 17794 tid 17875 thread 15 bound to OS proc set {60}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 16, "n_threads_batch": 16, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000001, "speed_pp": 0.000000, "t_tg": 14.338287, "speed_tg": 142.834351, "t": 14.338288, "speed": 142.834351}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_4
To display your profiling results:
###########################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_4 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_4 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_4 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_4 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_4 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_4 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_4 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_4 #
###########################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 17902 tid 17902 thread 0 bound to OS proc set {0}
OMP: pid 17902 tid 17971 thread 3 bound to OS proc set {8}
OMP: pid 17902 tid 17969 thread 1 bound to OS proc set {2}
OMP: pid 17902 tid 17970 thread 2 bound to OS proc set {5}
OMP: pid 17902 tid 17972 thread 4 bound to OS proc set {10}
OMP: pid 17902 tid 17975 thread 7 bound to OS proc set {18}
OMP: pid 17902 tid 17974 thread 6 bound to OS proc set {16}
OMP: pid 17902 tid 17976 thread 8 bound to OS proc set {21}
OMP: pid 17902 tid 17981 thread 13 bound to OS proc set {35}
OMP: pid 17902 tid 17979 thread 11 bound to OS proc set {29}
OMP: pid 17902 tid 17983 thread 15 bound to OS proc set {40}
OMP: pid 17902 tid 17986 thread 18 bound to OS proc set {48}
OMP: pid 17902 tid 17982 thread 14 bound to OS proc set {37}
OMP: pid 17902 tid 17985 thread 17 bound to OS proc set {46}
OMP: pid 17902 tid 17984 thread 16 bound to OS proc set {43}
OMP: pid 17902 tid 17973 thread 5 bound to OS proc set {13}
OMP: pid 17902 tid 17987 thread 19 bound to OS proc set {51}
OMP: pid 17902 tid 17977 thread 9 bound to OS proc set {24}
OMP: pid 17902 tid 17978 thread 10 bound to OS proc set {27}
OMP: pid 17902 tid 17980 thread 12 bound to OS proc set {32}
OMP: pid 17902 tid 17989 thread 21 bound to OS proc set {56}
OMP: pid 17902 tid 17988 thread 20 bound to OS proc set {54}
OMP: pid 17902 tid 17990 thread 22 bound to OS proc set {59}
OMP: pid 17902 tid 17991 thread 23 bound to OS proc set {62}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 24, "n_threads_batch": 24, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 10.996930, "speed_tg": 186.233795, "t": 10.996930, "speed": 186.233795}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_5
To display your profiling results:
###########################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_5 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_5 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_5 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_5 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_5 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_5 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_5 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_5 #
###########################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 18018 tid 18018 thread 0 bound to OS proc set {0}
OMP: pid 18018 tid 18086 thread 2 bound to OS proc set {4}
OMP: pid 18018 tid 18087 thread 3 bound to OS proc set {6}
OMP: pid 18018 tid 18096 thread 12 bound to OS proc set {24}
OMP: pid 18018 tid 18097 thread 13 bound to OS proc set {26}
OMP: pid 18018 tid 18088 thread 4 bound to OS proc set {8}
OMP: pid 18018 tid 18090 thread 6 bound to OS proc set {12}
OMP: pid 18018 tid 18101 thread 17 bound to OS proc set {34}
OMP: pid 18018 tid 18089 thread 5 bound to OS proc set {10}
OMP: pid 18018 tid 18092 thread 8 bound to OS proc set {16}
OMP: pid 18018 tid 18098 thread 14 bound to OS proc set {28}
OMP: pid 18018 tid 18085 thread 1 bound to OS proc set {2}
OMP: pid 18018 tid 18093 thread 9 bound to OS proc set {18}
OMP: pid 18018 tid 18091 thread 7 bound to OS proc set {14}
OMP: pid 18018 tid 18094 thread 10 bound to OS proc set {20}
OMP: pid 18018 tid 18099 thread 15 bound to OS proc set {30}
OMP: pid 18018 tid 18100 thread 16 bound to OS proc set {32}
OMP: pid 18018 tid 18103 thread 19 bound to OS proc set {38}
OMP: pid 18018 tid 18102 thread 18 bound to OS proc set {36}
OMP: pid 18018 tid 18095 thread 11 bound to OS proc set {22}
OMP: pid 18018 tid 18104 thread 20 bound to OS proc set {40}
OMP: pid 18018 tid 18105 thread 21 bound to OS proc set {42}
OMP: pid 18018 tid 18108 thread 24 bound to OS proc set {48}
OMP: pid 18018 tid 18106 thread 22 bound to OS proc set {44}
OMP: pid 18018 tid 18112 thread 28 bound to OS proc set {56}
OMP: pid 18018 tid 18110 thread 26 bound to OS proc set {52}
OMP: pid 18018 tid 18107 thread 23 bound to OS proc set {46}
OMP: pid 18018 tid 18109 thread 25 bound to OS proc set {50}
OMP: pid 18018 tid 18111 thread 27 bound to OS proc set {54}
OMP: pid 18018 tid 18113 thread 29 bound to OS proc set {58}
OMP: pid 18018 tid 18115 thread 31 bound to OS proc set {62}
OMP: pid 18018 tid 18114 thread 30 bound to OS proc set {60}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 32, "n_threads_batch": 32, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000001, "speed_pp": 0.000000, "t_tg": 9.075353, "speed_tg": 225.666168, "t": 9.075354, "speed": 225.666138}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_6
To display your profiling results:
###########################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_6 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_6 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_6 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_6 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_6 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_6 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_6 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_6 #
###########################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 18142 tid 18209 thread 1 bound to OS proc set {1}
OMP: pid 18142 tid 18142 thread 0 bound to OS proc set {0}
OMP: pid 18142 tid 18210 thread 2 bound to OS proc set {3}
OMP: pid 18142 tid 18211 thread 3 bound to OS proc set {4}
OMP: pid 18142 tid 18212 thread 4 bound to OS proc set {6}
OMP: pid 18142 tid 18214 thread 6 bound to OS proc set {9}
OMP: pid 18142 tid 18225 thread 17 bound to OS proc set {27}
OMP: pid 18142 tid 18218 thread 10 bound to OS proc set {16}
OMP: pid 18142 tid 18215 thread 7 bound to OS proc set {11}
OMP: pid 18142 tid 18213 thread 5 bound to OS proc set {8}
OMP: pid 18142 tid 18220 thread 12 bound to OS proc set {19}
OMP: pid 18142 tid 18219 thread 11 bound to OS proc set {17}
OMP: pid 18142 tid 18221 thread 13 bound to OS proc set {21}
OMP: pid 18142 tid 18226 thread 18 bound to OS proc set {29}
OMP: pid 18142 tid 18222 thread 14 bound to OS proc set {22}
OMP: pid 18142 tid 18217 thread 9 bound to OS proc set {14}
OMP: pid 18142 tid 18216 thread 8 bound to OS proc set {13}
OMP: pid 18142 tid 18227 thread 19 bound to OS proc set {30}
OMP: pid 18142 tid 18224 thread 16 bound to OS proc set {26}
OMP: pid 18142 tid 18223 thread 15 bound to OS proc set {24}
OMP: pid 18142 tid 18241 thread 33 bound to OS proc set {53}
OMP: pid 18142 tid 18240 thread 32 bound to OS proc set {52}
OMP: pid 18142 tid 18242 thread 34 bound to OS proc set {55}
OMP: pid 18142 tid 18243 thread 35 bound to OS proc set {56}
OMP: pid 18142 tid 18230 thread 22 bound to OS proc set {35}
OMP: pid 18142 tid 18228 thread 20 bound to OS proc set {32}
OMP: pid 18142 tid 18231 thread 23 bound to OS proc set {37}
OMP: pid 18142 tid 18236 thread 28 bound to OS proc set {45}
OMP: pid 18142 tid 18229 thread 21 bound to OS proc set {34}
OMP: pid 18142 tid 18233 thread 25 bound to OS proc set {40}
OMP: pid 18142 tid 18232 thread 24 bound to OS proc set {39}
OMP: pid 18142 tid 18238 thread 30 bound to OS proc set {48}
OMP: pid 18142 tid 18235 thread 27 bound to OS proc set {43}
OMP: pid 18142 tid 18237 thread 29 bound to OS proc set {47}
OMP: pid 18142 tid 18245 thread 37 bound to OS proc set {60}
OMP: pid 18142 tid 18244 thread 36 bound to OS proc set {58}
OMP: pid 18142 tid 18239 thread 31 bound to OS proc set {50}
OMP: pid 18142 tid 18234 thread 26 bound to OS proc set {42}
OMP: pid 18142 tid 18246 thread 38 bound to OS proc set {61}
OMP: pid 18142 tid 18247 thread 39 bound to OS proc set {63}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 40, "n_threads_batch": 40, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000001, "speed_pp": 0.000000, "t_tg": 8.038020, "speed_tg": 254.789108, "t": 8.038021, "speed": 254.789078}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_7
To display your profiling results:
###########################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_7 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_7 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_7 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_7 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_7 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_7 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_7 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_7 #
###########################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 18274 tid 18341 thread 1 bound to OS proc set {1}
OMP: pid 18274 tid 18274 thread 0 bound to OS proc set {0}
OMP: pid 18274 tid 18342 thread 2 bound to OS proc set {2}
OMP: pid 18274 tid 18346 thread 6 bound to OS proc set {8}
OMP: pid 18274 tid 18351 thread 11 bound to OS proc set {14}
OMP: pid 18274 tid 18348 thread 8 bound to OS proc set {10}
OMP: pid 18274 tid 18347 thread 7 bound to OS proc set {9}
OMP: pid 18274 tid 18350 thread 10 bound to OS proc set {13}
OMP: pid 18274 tid 18343 thread 3 bound to OS proc set {4}
OMP: pid 18274 tid 18352 thread 12 bound to OS proc set {16}
OMP: pid 18274 tid 18349 thread 9 bound to OS proc set {12}
OMP: pid 18274 tid 18355 thread 15 bound to OS proc set {20}
OMP: pid 18274 tid 18373 thread 33 bound to OS proc set {44}
OMP: pid 18274 tid 18357 thread 17 bound to OS proc set {23}
OMP: pid 18274 tid 18345 thread 5 bound to OS proc set {6}
OMP: pid 18274 tid 18374 thread 34 bound to OS proc set {46}
OMP: pid 18274 tid 18354 thread 14 bound to OS proc set {18}
OMP: pid 18274 tid 18353 thread 13 bound to OS proc set {17}
OMP: pid 18274 tid 18356 thread 16 bound to OS proc set {21}
OMP: pid 18274 tid 18372 thread 32 bound to OS proc set {43}
OMP: pid 18274 tid 18375 thread 35 bound to OS proc set {47}
OMP: pid 18274 tid 18344 thread 4 bound to OS proc set {5}
OMP: pid 18274 tid 18358 thread 18 bound to OS proc set {24}
OMP: pid 18274 tid 18359 thread 19 bound to OS proc set {25}
OMP: pid 18274 tid 18369 thread 29 bound to OS proc set {39}
OMP: pid 18274 tid 18362 thread 22 bound to OS proc set {29}
OMP: pid 18274 tid 18360 thread 20 bound to OS proc set {27}
OMP: pid 18274 tid 18368 thread 28 bound to OS proc set {37}
OMP: pid 18274 tid 18361 thread 21 bound to OS proc set {28}
OMP: pid 18274 tid 18363 thread 23 bound to OS proc set {31}
OMP: pid 18274 tid 18365 thread 25 bound to OS proc set {33}
OMP: pid 18274 tid 18366 thread 26 bound to OS proc set {35}
OMP: pid 18274 tid 18370 thread 30 bound to OS proc set {40}
OMP: pid 18274 tid 18364 thread 24 bound to OS proc set {32}
OMP: pid 18274 tid 18367 thread 27 bound to OS proc set {36}
OMP: pid 18274 tid 18376 thread 36 bound to OS proc set {48}
OMP: pid 18274 tid 18371 thread 31 bound to OS proc set {41}
OMP: pid 18274 tid 18381 thread 41 bound to OS proc set {55}
OMP: pid 18274 tid 18380 thread 40 bound to OS proc set {54}
OMP: pid 18274 tid 18377 thread 37 bound to OS proc set {50}
OMP: pid 18274 tid 18385 thread 45 bound to OS proc set {60}
OMP: pid 18274 tid 18378 thread 38 bound to OS proc set {51}
OMP: pid 18274 tid 18384 thread 44 bound to OS proc set {59}
OMP: pid 18274 tid 18379 thread 39 bound to OS proc set {52}
OMP: pid 18274 tid 18383 thread 43 bound to OS proc set {58}
OMP: pid 18274 tid 18386 thread 46 bound to OS proc set {62}
OMP: pid 18274 tid 18382 thread 42 bound to OS proc set {56}
OMP: pid 18274 tid 18387 thread 47 bound to OS proc set {63}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 48, "n_threads_batch": 48, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 7.276100, "speed_tg": 281.469452, "t": 7.276100, "speed": 281.469452}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_8
To display your profiling results:
###########################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_8 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_8 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_8 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_8 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_8 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_8 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_8 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_8 #
###########################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 18414 tid 18481 thread 1 bound to OS proc set {1}
OMP: pid 18414 tid 18483 thread 3 bound to OS proc set {3}
OMP: pid 18414 tid 18482 thread 2 bound to OS proc set {2}
OMP: pid 18414 tid 18414 thread 0 bound to OS proc set {0}
OMP: pid 18414 tid 18484 thread 4 bound to OS proc set {4}
OMP: pid 18414 tid 18486 thread 6 bound to OS proc set {6}
OMP: pid 18414 tid 18488 thread 8 bound to OS proc set {9}
OMP: pid 18414 tid 18485 thread 5 bound to OS proc set {5}
OMP: pid 18414 tid 18489 thread 9 bound to OS proc set {10}
OMP: pid 18414 tid 18494 thread 14 bound to OS proc set {16}
OMP: pid 18414 tid 18492 thread 12 bound to OS proc set {13}
OMP: pid 18414 tid 18490 thread 10 bound to OS proc set {11}
OMP: pid 18414 tid 18501 thread 21 bound to OS proc set {24}
OMP: pid 18414 tid 18487 thread 7 bound to OS proc set {8}
OMP: pid 18414 tid 18491 thread 11 bound to OS proc set {12}
OMP: pid 18414 tid 18498 thread 18 bound to OS proc set {20}
OMP: pid 18414 tid 18497 thread 17 bound to OS proc set {19}
OMP: pid 18414 tid 18529 thread 49 bound to OS proc set {56}
OMP: pid 18414 tid 18504 thread 24 bound to OS proc set {27}
OMP: pid 18414 tid 18500 thread 20 bound to OS proc set {23}
OMP: pid 18414 tid 18496 thread 16 bound to OS proc set {18}
OMP: pid 18414 tid 18502 thread 22 bound to OS proc set {25}
OMP: pid 18414 tid 18513 thread 33 bound to OS proc set {38}
OMP: pid 18414 tid 18495 thread 15 bound to OS proc set {17}
OMP: pid 18414 tid 18530 thread 50 bound to OS proc set {58}
OMP: pid 18414 tid 18512 thread 32 bound to OS proc set {37}
OMP: pid 18414 tid 18499 thread 19 bound to OS proc set {22}
OMP: pid 18414 tid 18493 thread 13 bound to OS proc set {15}
OMP: pid 18414 tid 18531 thread 51 bound to OS proc set {59}
OMP: pid 18414 tid 18514 thread 34 bound to OS proc set {39}
OMP: pid 18414 tid 18508 thread 28 bound to OS proc set {32}
OMP: pid 18414 tid 18528 thread 48 bound to OS proc set {55}
OMP: pid 18414 tid 18525 thread 45 bound to OS proc set {52}
OMP: pid 18414 tid 18515 thread 35 bound to OS proc set {40}
OMP: pid 18414 tid 18522 thread 42 bound to OS proc set {48}
OMP: pid 18414 tid 18517 thread 37 bound to OS proc set {42}
OMP: pid 18414 tid 18505 thread 25 bound to OS proc set {29}
OMP: pid 18414 tid 18524 thread 44 bound to OS proc set {51}
OMP: pid 18414 tid 18521 thread 41 bound to OS proc set {47}
OMP: pid 18414 tid 18503 thread 23 bound to OS proc set {26}
OMP: pid 18414 tid 18506 thread 26 bound to OS proc set {30}
OMP: pid 18414 tid 18527 thread 47 bound to OS proc set {54}
OMP: pid 18414 tid 18518 thread 38 bound to OS proc set {44}
OMP: pid 18414 tid 18516 thread 36 bound to OS proc set {41}
OMP: pid 18414 tid 18520 thread 40 bound to OS proc set {46}
OMP: pid 18414 tid 18509 thread 29 bound to OS proc set {33}
OMP: pid 18414 tid 18523 thread 43 bound to OS proc set {49}
OMP: pid 18414 tid 18526 thread 46 bound to OS proc set {53}
OMP: pid 18414 tid 18510 thread 30 bound to OS proc set {34}
OMP: pid 18414 tid 18511 thread 31 bound to OS proc set {35}
OMP: pid 18414 tid 18507 thread 27 bound to OS proc set {31}
OMP: pid 18414 tid 18534 thread 54 bound to OS proc set {62}
OMP: pid 18414 tid 18519 thread 39 bound to OS proc set {45}
OMP: pid 18414 tid 18533 thread 53 bound to OS proc set {61}
OMP: pid 18414 tid 18532 thread 52 bound to OS proc set {60}
OMP: pid 18414 tid 18535 thread 55 bound to OS proc set {63}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 56, "n_threads_batch": 56, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000000, "speed_pp": nan, "t_tg": 6.789434, "speed_tg": 301.645172, "t": 6.789434, "speed": 301.645172}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_9
To display your profiling results:
###########################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_9 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_9 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_9 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_9 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_9 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_9 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_9 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_9 #
###########################################################################################################################################################################################################################
* [MAQAO] Info: Detected 1 Lprof instances in ip-172-31-18-66.
If this is incorrect, rerun with number-processes-per-node=X
OMP: pid 18562 tid 18629 thread 1 bound to OS proc set {1}
OMP: pid 18562 tid 18631 thread 3 bound to OS proc set {3}
OMP: pid 18562 tid 18630 thread 2 bound to OS proc set {2}
OMP: pid 18562 tid 18562 thread 0 bound to OS proc set {0}
OMP: pid 18562 tid 18632 thread 4 bound to OS proc set {4}
OMP: pid 18562 tid 18636 thread 8 bound to OS proc set {8}
OMP: pid 18562 tid 18633 thread 5 bound to OS proc set {5}
OMP: pid 18562 tid 18637 thread 9 bound to OS proc set {9}
OMP: pid 18562 tid 18640 thread 12 bound to OS proc set {12}
OMP: pid 18562 tid 18641 thread 13 bound to OS proc set {13}
OMP: pid 18562 tid 18638 thread 10 bound to OS proc set {10}
OMP: pid 18562 tid 18635 thread 7 bound to OS proc set {7}
OMP: pid 18562 tid 18642 thread 14 bound to OS proc set {14}
OMP: pid 18562 tid 18639 thread 11 bound to OS proc set {11}
OMP: pid 18562 tid 18643 thread 15 bound to OS proc set {15}
OMP: pid 18562 tid 18634 thread 6 bound to OS proc set {6}
OMP: pid 18562 tid 18646 thread 18 bound to OS proc set {18}
OMP: pid 18562 tid 18645 thread 17 bound to OS proc set {17}
OMP: pid 18562 tid 18644 thread 16 bound to OS proc set {16}
OMP: pid 18562 tid 18647 thread 19 bound to OS proc set {19}
OMP: pid 18562 tid 18677 thread 49 bound to OS proc set {49}
OMP: pid 18562 tid 18648 thread 20 bound to OS proc set {20}
OMP: pid 18562 tid 18662 thread 34 bound to OS proc set {34}
OMP: pid 18562 tid 18678 thread 50 bound to OS proc set {50}
OMP: pid 18562 tid 18663 thread 35 bound to OS proc set {35}
OMP: pid 18562 tid 18661 thread 33 bound to OS proc set {33}
OMP: pid 18562 tid 18660 thread 32 bound to OS proc set {32}
OMP: pid 18562 tid 18679 thread 51 bound to OS proc set {51}
OMP: pid 18562 tid 18649 thread 21 bound to OS proc set {21}
OMP: pid 18562 tid 18668 thread 40 bound to OS proc set {40}
OMP: pid 18562 tid 18676 thread 48 bound to OS proc set {48}
OMP: pid 18562 tid 18653 thread 25 bound to OS proc set {25}
OMP: pid 18562 tid 18650 thread 22 bound to OS proc set {22}
OMP: pid 18562 tid 18664 thread 36 bound to OS proc set {36}
OMP: pid 18562 tid 18666 thread 38 bound to OS proc set {38}
OMP: pid 18562 tid 18665 thread 37 bound to OS proc set {37}
OMP: pid 18562 tid 18657 thread 29 bound to OS proc set {29}
OMP: pid 18562 tid 18654 thread 26 bound to OS proc set {26}
OMP: pid 18562 tid 18680 thread 52 bound to OS proc set {52}
OMP: pid 18562 tid 18652 thread 24 bound to OS proc set {24}
OMP: pid 18562 tid 18658 thread 30 bound to OS proc set {30}
OMP: pid 18562 tid 18670 thread 42 bound to OS proc set {42}
OMP: pid 18562 tid 18669 thread 41 bound to OS proc set {41}
OMP: pid 18562 tid 18672 thread 44 bound to OS proc set {44}
OMP: pid 18562 tid 18673 thread 45 bound to OS proc set {45}
OMP: pid 18562 tid 18651 thread 23 bound to OS proc set {23}
OMP: pid 18562 tid 18671 thread 43 bound to OS proc set {43}
OMP: pid 18562 tid 18685 thread 57 bound to OS proc set {57}
OMP: pid 18562 tid 18681 thread 53 bound to OS proc set {53}
OMP: pid 18562 tid 18667 thread 39 bound to OS proc set {39}
OMP: pid 18562 tid 18656 thread 28 bound to OS proc set {28}
OMP: pid 18562 tid 18655 thread 27 bound to OS proc set {27}
OMP: pid 18562 tid 18684 thread 56 bound to OS proc set {56}
OMP: pid 18562 tid 18682 thread 54 bound to OS proc set {54}
OMP: pid 18562 tid 18659 thread 31 bound to OS proc set {31}
OMP: pid 18562 tid 18683 thread 55 bound to OS proc set {55}
OMP: pid 18562 tid 18688 thread 60 bound to OS proc set {60}
OMP: pid 18562 tid 18691 thread 63 bound to OS proc set {63}
OMP: pid 18562 tid 18689 thread 61 bound to OS proc set {61}
OMP: pid 18562 tid 18674 thread 46 bound to OS proc set {46}
OMP: pid 18562 tid 18686 thread 58 bound to OS proc set {58}
OMP: pid 18562 tid 18675 thread 47 bound to OS proc set {47}
OMP: pid 18562 tid 18690 thread 62 bound to OS proc set {62}
OMP: pid 18562 tid 18687 thread 59 bound to OS proc set {59}
{"n_kv_max": 16384, "n_batch": 2048, "n_ubatch": 512, "flash_attn": -1, "is_pp_shared": 0, "n_gpu_layers": -1, "n_threads": 64, "n_threads_batch": 64, "pp": 0, "tg": 128, "pl": 16, "n_kv": 2048, "t_pp": 0.000001, "speed_pp": 0.000000, "t_tg": 6.497191, "speed_tg": 315.213135, "t": 6.497192, "speed": 315.213104}
Your experiment path is /home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_10
To display your profiling results:
############################################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
############################################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_10 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_10 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_10 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_10 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_10 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_10 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_10 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/Tools/QaaS/qaas_runs/ip-172-31-18-66/176-138-1719/llama.cpp/run/oneview_runs/multicore/armclang/oneview_results_1761384116/tools/lprof_npsu_run_10 #
############################################################################################################################################################################################################################