Run | Source Region | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s
Neoverse V1 ACFL Ofast Base | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 114-123 | 15 | 70.87 | 1.49 | 1.75 | 64 | 8.16 | 0.18 | 17.04
Neoverse V1 ACFL Ofast Manual Unroll | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 113-141 | 14 | 63.96 | 1.09 | 1.23 | 64 | 5.41 | 0.06 | 10.60

Run | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s
Neoverse V1 ACFL Ofast Base | 1 | 0.05 | 0.00 | 0.01 | 12 | 0.02 | 0.00 | 0.00
Neoverse V1 ACFL Ofast Base | 528 | 0.99 | 0.02 | 0.05 | 62 | 0.56 | 0.01 | 0.00
Neoverse V1 ACFL Ofast Base | 1241 | 0.04 | 0.00 | 0.01 | 8 | 0.11 | 0.00 | 0.00
Neoverse V1 ACFL Ofast Base | 854 | 19.33 | 0.41 | 0.63 | 64 | 6.21 | 0.11 | 0.00
Neoverse V1 ACFL Ofast Base | 437 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00
Neoverse V1 ACFL Ofast Base | 1137 | 0.83 | 0.02 | 0.04 | 59 | 0.53 | 0.01 | 0.00
Neoverse V1 ACFL Ofast Base | -1 | 0.00 | 0.00 | 0.00 | 3 | 0.00 | 0.00 | NA
Neoverse V1 ACFL Ofast SoA | 1 | 0.07 | 0.00 | 0.01 | 16 | 0.07 | 0.00 | 0.00
Neoverse V1 ACFL Ofast SoA | 528 | 1.10 | 0.02 | 0.05 | 63 | 0.55 | 0.01 | 0.00
Neoverse V1 ACFL Ofast SoA | 1241 | 0.03 | 0.00 | 0.01 | 6 | 0.01 | 0.00 | 0.00
Neoverse V1 ACFL Ofast SoA | 854 | 20.97 | 0.41 | 0.47 | 63 | 2.03 | 0.04 | 0.00
Neoverse V1 ACFL Ofast SoA | 1137 | 1.04 | 0.02 | 0.06 | 62 | 0.53 | 0.01 | 0.00
Neoverse V1 ACFL Ofast SoA | -1 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00
Neoverse V1 ACFL Ofast SoA | -1 | 0.00 | 0.00 | 0.00 | 2 | 0.00 | 0.00 | NA
Neoverse V1 ACFL Ofast Manual Unroll | 1 | 0.11 | 0.00 | 0.02 | 18 | 0.24 | 0.00 | 0.00
Neoverse V1 ACFL Ofast Manual Unroll | 528 | 1.44 | 0.02 | 0.06 | 64 | 0.72 | 0.01 | 0.00
Neoverse V1 ACFL Ofast Manual Unroll | 1241 | 0.05 | 0.00 | 0.01 | 10 | 0.10 | 0.00 | 0.00
Neoverse V1 ACFL Ofast Manual Unroll | 854 | 24.60 | 0.42 | 0.47 | 64 | 2.82 | 0.05 | 0.00
Neoverse V1 ACFL Ofast Manual Unroll | 1892 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00
Neoverse V1 ACFL Ofast Manual Unroll | 1137 | 1.27 | 0.02 | 0.04 | 63 | 0.64 | 0.01 | 0.00
Neoverse V1 ACFL Ofast Manual Unroll | -1 | 0.00 | 0.00 | 0.00 | 8 | 0.00 | 0.00 | NA
Neoverse V1 ACFL Ofast Manual Unroll + SoA | 1 | 0.08 | 0.00 | 0.01 | 15 | 0.11 | 0.00 | 0.00
Neoverse V1 ACFL Ofast Manual Unroll + SoA | 528 | 0.92 | 0.02 | 0.04 | 61 | 0.59 | 0.01 | 0.00
Neoverse V1 ACFL Ofast Manual Unroll + SoA | 1901 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00
Neoverse V1 ACFL Ofast Manual Unroll + SoA | 1241 | 0.04 | 0.00 | 0.01 | 7 | 0.12 | 0.00 | 0.00
Neoverse V1 ACFL Ofast Manual Unroll + SoA | 854 | 18.29 | 0.37 | 0.44 | 63 | 2.89 | 0.05 | 0.00
Neoverse V1 ACFL Ofast Manual Unroll + SoA | 1885 | 0.00 | 0.00 | 0.01 | 1 | 0.00 | 0.00 | 0.00
Neoverse V1 ACFL Ofast Manual Unroll + SoA | 1137 | 1.04 | 0.02 | 0.04 | 63 | 0.53 | 0.01 | 0.00
Neoverse V1 ACFL Ofast Manual Unroll + SoA | 789 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00
Neoverse V1 ACFL Ofast Manual Unroll + SoA | -1 | 0.00 | 0.00 | 0.00 | 40 | 0.00 | 0.00 | NA

Run | Source Region | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s
Neoverse V1 ACFL Ofast Manual Unroll + SoA | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 69-97 | 14 | 71.74 | 1.45 | 1.50 | 64 | 6.17 | 0.14 | 379.40

Run | Source Region | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s
Neoverse V1 ACFL Ofast SoA | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 67-78 | 14 | 70.45 | 1.36 | 1.59 | 64 | 6.40 | 0.10 | 8.73

Run | Source Region | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s
Neoverse V1 ACFL Ofast Manual Unroll | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 155-160 | 16 | 8.57 | 0.15 | 0.20 | 63 | 3.21 | 0.05 | 1.51

Run | Source Region | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s
Neoverse V1 ACFL Ofast Manual Unroll + SoA | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 111-116 | 16 | 7.89 | 0.16 | 0.21 | 63 | 2.88 | 0.05 | 27.07

Run | Source Region | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s
Neoverse V1 ACFL Ofast Base | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 139-144 | 17 | 7.87 | 0.17 | 0.19 | 64 | 2.42 | 0.05 | 1.20

Run | Source Region | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s
Neoverse V1 ACFL Ofast SoA | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 92-97 | 16 | 6.34 | 0.12 | 0.19 | 63 | 3.73 | 0.07 | 1.47

Runs (all Neoverse V1 ACFL Ofast): Base, SoA, Manual Unroll, Manual Unroll + SoA
Name | Module | Coverage (%) [Base] | Coverage (%) [SoA] | Coverage (%) [Manual Unroll] | Coverage (%) [Manual Unroll + SoA] | Inclusive Time w.r.t. Wall Time (s) [Base] | Inclusive Time w.r.t. Wall Time (s) [SoA] | Inclusive Time w.r.t. Wall Time (s) [Manual Unroll] | Inclusive Time w.r.t. Wall Time (s) [Manual Unroll + SoA] | Max Inc. Time over Threads (s) [Base] | Max Inc. Time over Threads (s) [SoA] | Max Inc. Time over Threads (s) [Manual Unroll] | Max Inc. Time over Threads (s) [Manual Unroll + SoA] | Nb Threads [Base] | Nb Threads [SoA] | Nb Threads [Manual Unroll] | Nb Threads [Manual Unroll + SoA] | GFLOP/s [Base] | GFLOP/s [SoA] | GFLOP/s [Manual Unroll] | GFLOP/s [Manual Unroll + SoA] | Deviation (coverage) [Base] | Deviation (coverage) [SoA] | Deviation (coverage) [Manual Unroll] | Deviation (coverage) [Manual Unroll + SoA] | Deviation (time) [Base] | Deviation (time) [SoA] | Deviation (time) [Manual Unroll] | Deviation (time) [Manual Unroll + SoA]
k_means(int, point_t&, point_t&, int*, int, int) [clone .omp_outlined] | binary | NA | 70.45 | NA | 71.74 | NA | 1.36 | NA | 1.45 | NA | 1.59 | NA | 1.50 | NA | 64 | NA | 64 | NA | 8.73 | NA | 379.40 | NA | 6.40 | NA | 6.17 | NA | 0.10 | NA | 0.14 |
k_means(int, point_t*, point_t*, int*, int, int) [clone .omp_outlined] | binary | 70.87 | NA | 63.96 | NA | 1.49 | NA | 1.09 | NA | 1.75 | NA | 1.23 | NA | 64 | NA | 64 | NA | 17.04 | NA | 10.60 | NA | 8.16 | NA | 5.41 | NA | 0.18 | NA | 0.06 | NA |
kmp_flag_64<false, true>::wait(kmp_info*, int, void*) | libomp.so | 19.33 | 20.97 | 24.60 | 18.29 | 0.41 | 0.41 | 0.42 | 0.37 | 0.63 | 0.47 | 0.47 | 0.44 | 64 | 63 | 64 | 63 | 0.00 | 0.00 | 0.00 | 0.00 | 6.21 | 2.03 | 2.82 | 2.89 | 0.11 | 0.04 | 0.05 | 0.05 |
k_means(int, point_t*, point_t*, int*, int, int) [clone .omp_outlined.3] | binary | 7.87 | NA | 8.57 | NA | 0.17 | NA | 0.15 | NA | 0.19 | NA | 0.20 | NA | 64 | NA | 63 | NA | 1.20 | NA | 1.51 | NA | 2.42 | NA | 3.21 | NA | 0.05 | NA | 0.05 | NA |
k_means(int, point_t&, point_t&, int*, int, int) [clone .omp_outlined.3] | binary | NA | 6.34 | NA | 7.89 | NA | 0.12 | NA | 0.16 | NA | 0.19 | NA | 0.21 | NA | 63 | NA | 63 | NA | 1.47 | NA | 27.07 | NA | 3.73 | NA | 2.88 | NA | 0.07 | NA | 0.05 |
kmp_flag_native<unsigned long long, (flag_type)1, true>::notdone_check() | libomp.so | 0.99 | 1.10 | 1.44 | 0.92 | 0.02 | 0.02 | 0.02 | 0.02 | 0.05 | 0.05 | 0.06 | 0.04 | 62 | 63 | 64 | 61 | 0.00 | 0.00 | 0.00 | 0.00 | 0.56 | 0.55 | 0.72 | 0.59 | 0.01 | 0.01 | 0.01 | 0.01 |
__sched_yield | libc.so.6 | 0.83 | 1.04 | 1.27 | 1.04 | 0.02 | 0.02 | 0.02 | 0.02 | 0.04 | 0.06 | 0.04 | 0.04 | 59 | 62 | 63 | 63 | 0.00 | 0.00 | 0.00 | 0.00 | 0.53 | 0.53 | 0.64 | 0.53 | 0.01 | 0.01 | 0.01 | 0.01 |
@plt_start@ | libomp.so | 0.05 | 0.07 | 0.11 | 0.08 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.02 | 0.01 | 12 | 16 | 18 | 15 | 0.00 | 0.00 | 0.00 | 0.00 | 0.02 | 0.07 | 0.24 | 0.11 | 0.00 | 0.00 | 0.00 | 0.00 |
__kmp_yield | libomp.so | 0.04 | 0.03 | 0.05 | 0.04 | 0.00 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.01 | 0.01 | 8 | 6 | 10 | 7 | 0.00 | 0.00 | 0.00 | 0.00 | 0.11 | 0.01 | 0.10 | 0.12 | 0.00 | 0.00 | 0.00 | 0.00 |
__aarch64_ldadd4_relax | libomp.so | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA |
__kmp_resume_if_soft_paused | libomp.so | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA |
__default_morecore | libc.so.6 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 |
__kmp_invoke_microtask | libomp.so | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.01 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 |
__aarch64_ldadd8_acq_rel | libomp.so | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 |
unknown_function | binary | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 1 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA | NA | 0.00 | NA | NA |
unknown_kernel_region | kernel | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 3 | 2 | 8 | 40 | NA | NA | NA | NA | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
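
Each function row above carries 28 values after Name and Module: seven metrics, each repeated for the four runs in the order Base, SoA, Manual Unroll, Manual Unroll + SoA. The following is a minimal Python sketch, assuming the rows are available as pipe-separated text exactly as printed here, that regroups one row by metric and run. The names `parse_row`, `METRICS`, and `RUNS` are hypothetical helpers introduced for illustration; they are not part of the profiler's output or of any tool's API.

```python
# Metric order as given in the function table header (assumed fixed).
METRICS = [
    "Coverage (%)",
    "Inclusive Time w.r.t. Wall Time (s)",
    "Max Inc. Time over Threads (s)",
    "Nb Threads",
    "GFLOP/s",
    "Deviation (coverage)",
    "Deviation (time)",
]
# Run order within each metric group (all runs are Neoverse V1 ACFL Ofast).
RUNS = ["Base", "SoA", "Manual Unroll", "Manual Unroll + SoA"]


def parse_row(line):
    """Split one pipe-separated function row into (name, module, values).

    values maps metric -> run -> float, with None where the table has NA.
    """
    cells = [cell.strip() for cell in line.split("|")]
    name, module, raw = cells[0], cells[1], cells[2:]
    if len(raw) != len(METRICS) * len(RUNS):
        raise ValueError(f"expected {len(METRICS) * len(RUNS)} values, got {len(raw)}")
    values = {}
    for m, metric in enumerate(METRICS):
        group = raw[m * len(RUNS):(m + 1) * len(RUNS)]
        values[metric] = {
            run: None if text == "NA" else float(text)
            for run, text in zip(RUNS, group)
        }
    return name, module, values


# Example row copied from the table above (the libomp wait function).
ROW = (
    "kmp_flag_64<false, true>::wait(kmp_info*, int, void*) | libomp.so | "
    "19.33 | 20.97 | 24.60 | 18.29 | 0.41 | 0.41 | 0.42 | 0.37 | "
    "0.63 | 0.47 | 0.47 | 0.44 | 64 | 63 | 64 | 63 | "
    "0.00 | 0.00 | 0.00 | 0.00 | 6.21 | 2.03 | 2.82 | 2.89 | "
    "0.11 | 0.04 | 0.05 | 0.05"
)

name, module, values = parse_row(ROW)
for run in RUNS:
    print(run, values["Coverage (%)"][run])
```

Keeping NA as `None` rather than zero preserves the distinction between a function that was not sampled in a run and one that was sampled with negligible coverage.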