Run | Source location | ASM Fct ID | Coverage (%) | Inc. Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (coverage) | Deviation (time) | GFLOP/s |
Neoverse V1 ACFL Ofast (base) | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 114-123 | 15 | 70.87 | 1.49 | 1.75 | 64 | 8.16 | 0.18 | 17.04 |
Neoverse V1 ACFL Ofast SoA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
Neoverse V1 ACFL Ofast Manual Unroll | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 113-141 | 14 | 63.96 | 1.09 | 1.23 | 64 | 5.41 | 0.06 | 10.60 |
Run | ASM Fct ID | Coverage (%) | Inc. Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (coverage) | Deviation (time) | GFLOP/s |
Neoverse V1 ACFL Ofast (base) | 1 | 0.05 | 0.00 | 0.01 | 12 | 0.02 | 0.00 | 0.00 |
Neoverse V1 ACFL Ofast (base) | 528 | 0.99 | 0.02 | 0.05 | 62 | 0.56 | 0.01 | 0.00 |
Neoverse V1 ACFL Ofast (base) | 1241 | 0.04 | 0.00 | 0.01 | 8 | 0.11 | 0.00 | 0.00 |
Neoverse V1 ACFL Ofast (base) | 854 | 19.33 | 0.41 | 0.63 | 64 | 6.21 | 0.11 | 0.00 |
Neoverse V1 ACFL Ofast (base) | 437 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 |
Neoverse V1 ACFL Ofast (base) | 1137 | 0.83 | 0.02 | 0.04 | 59 | 0.53 | 0.01 | 0.00 |
Neoverse V1 ACFL Ofast (base) | -1 | 0.00 | 0.00 | 0.00 | 3 | 0.00 | 0.00 | NA |
Neoverse V1 ACFL Ofast SoA | 1 | 0.07 | 0.00 | 0.01 | 16 | 0.07 | 0.00 | 0.00 |
Neoverse V1 ACFL Ofast SoA | 528 | 1.10 | 0.02 | 0.05 | 63 | 0.55 | 0.01 | 0.00 |
Neoverse V1 ACFL Ofast SoA | 1241 | 0.03 | 0.00 | 0.01 | 6 | 0.01 | 0.00 | 0.00 |
Neoverse V1 ACFL Ofast SoA | 854 | 20.97 | 0.41 | 0.47 | 63 | 2.03 | 0.04 | 0.00 |
Neoverse V1 ACFL Ofast SoA | 1137 | 1.04 | 0.02 | 0.06 | 62 | 0.53 | 0.01 | 0.00 |
Neoverse V1 ACFL Ofast SoA | -1 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 |
Neoverse V1 ACFL Ofast SoA | -1 | 0.00 | 0.00 | 0.00 | 2 | 0.00 | 0.00 | NA |
Neoverse V1 ACFL Ofast Manual Unroll | 1 | 0.11 | 0.00 | 0.02 | 18 | 0.24 | 0.00 | 0.00 |
Neoverse V1 ACFL Ofast Manual Unroll | 528 | 1.44 | 0.02 | 0.06 | 64 | 0.72 | 0.01 | 0.00 |
Neoverse V1 ACFL Ofast Manual Unroll | 1241 | 0.05 | 0.00 | 0.01 | 10 | 0.10 | 0.00 | 0.00 |
Neoverse V1 ACFL Ofast Manual Unroll | 854 | 24.60 | 0.42 | 0.47 | 64 | 2.82 | 0.05 | 0.00 |
Neoverse V1 ACFL Ofast Manual Unroll | 1892 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 |
Neoverse V1 ACFL Ofast Manual Unroll | 1137 | 1.27 | 0.02 | 0.04 | 63 | 0.64 | 0.01 | 0.00 |
Neoverse V1 ACFL Ofast Manual Unroll | -1 | 0.00 | 0.00 | 0.00 | 8 | 0.00 | 0.00 | NA |
Run | Source location | ASM Fct ID | Coverage (%) | Inc. Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (coverage) | Deviation (time) | GFLOP/s |
Neoverse V1 ACFL Ofast (base) | NA | NA | NA | NA | NA | NA | NA | NA | NA |
Neoverse V1 ACFL Ofast SoA | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 67-78 | 14 | 70.45 | 1.36 | 1.59 | 64 | 6.40 | 0.10 | 8.73 |
Neoverse V1 ACFL Ofast Manual Unroll | NA | NA | NA | NA | NA | NA | NA | NA | NA |
Run | Source location | ASM Fct ID | Coverage (%) | Inc. Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (coverage) | Deviation (time) | GFLOP/s |
Neoverse V1 ACFL Ofast (base) | NA | NA | NA | NA | NA | NA | NA | NA | NA |
Neoverse V1 ACFL Ofast SoA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
Neoverse V1 ACFL Ofast Manual Unroll | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 155-160 | 16 | 8.57 | 0.15 | 0.20 | 63 | 3.21 | 0.05 | 1.51 |
Run | Source location | ASM Fct ID | Coverage (%) | Inc. Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (coverage) | Deviation (time) | GFLOP/s |
Neoverse V1 ACFL Ofast (base) | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 139-144 | 17 | 7.87 | 0.17 | 0.19 | 64 | 2.42 | 0.05 | 1.20 |
Neoverse V1 ACFL Ofast SoA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
Neoverse V1 ACFL Ofast Manual Unroll | NA | NA | NA | NA | NA | NA | NA | NA | NA |
Run | Source location | ASM Fct ID | Coverage (%) | Inc. Time w.r.t. Wall Time (s) | Max Inc. Time over Threads (s) | Nb Threads | Deviation (coverage) | Deviation (time) | GFLOP/s |
Neoverse V1 ACFL Ofast (base) | NA | NA | NA | NA | NA | NA | NA | NA | NA |
Neoverse V1 ACFL Ofast SoA | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 92-97 | 16 | 6.34 | 0.12 | 0.19 | 63 | 3.73 | 0.07 | 1.47 |
Neoverse V1 ACFL Ofast Manual Unroll | NA | NA | NA | NA | NA | NA | NA | NA | NA |
Runs: base = Neoverse V1 ACFL Ofast (base); SoA = Neoverse V1 ACFL Ofast SoA; Manual Unroll = Neoverse V1 ACFL Ofast Manual Unroll.
Name | Module | Coverage (%) base | Coverage (%) SoA | Coverage (%) Manual Unroll | Inc. Time w.r.t. Wall Time (s) base | Inc. Time w.r.t. Wall Time (s) SoA | Inc. Time w.r.t. Wall Time (s) Manual Unroll | Max Inc. Time over Threads (s) base | Max Inc. Time over Threads (s) SoA | Max Inc. Time over Threads (s) Manual Unroll | Nb Threads base | Nb Threads SoA | Nb Threads Manual Unroll | GFLOP/s base | GFLOP/s SoA | GFLOP/s Manual Unroll | Deviation (coverage) base | Deviation (coverage) SoA | Deviation (coverage) Manual Unroll | Deviation (time) base | Deviation (time) SoA | Deviation (time) Manual Unroll |
k_means(int, point_t*, point_t*, int*, int, int) [clone .omp_outlined] | binary | 70.87 | NA | 63.96 | 1.49 | NA | 1.09 | 1.75 | NA | 1.23 | 64 | NA | 64 | 17.04 | NA | 10.60 | 8.16 | NA | 5.41 | 0.18 | NA | 0.06 |
k_means(int, point_t&, point_t&, int*, int, int) [clone .omp_outlined] | binary | NA | 70.45 | NA | NA | 1.36 | NA | NA | 1.59 | NA | NA | 64 | NA | NA | 8.73 | NA | NA | 6.40 | NA | NA | 0.10 | NA |
kmp_flag_64<false, true>::wait(kmp_info*, int, void*) | libomp.so | 19.33 | 20.97 | 24.60 | 0.41 | 0.41 | 0.42 | 0.63 | 0.47 | 0.47 | 64 | 63 | 64 | 0.00 | 0.00 | 0.00 | 6.21 | 2.03 | 2.82 | 0.11 | 0.04 | 0.05 |
k_means(int, point_t*, point_t*, int*, int, int) [clone .omp_outlined.3] | binary | 7.87 | NA | 8.57 | 0.17 | NA | 0.15 | 0.19 | NA | 0.20 | 64 | NA | 63 | 1.20 | NA | 1.51 | 2.42 | NA | 3.21 | 0.05 | NA | 0.05 |
k_means(int, point_t&, point_t&, int*, int, int) [clone .omp_outlined.3] | binary | NA | 6.34 | NA | NA | 0.12 | NA | NA | 0.19 | NA | NA | 63 | NA | NA | 1.47 | NA | NA | 3.73 | NA | NA | 0.07 | NA |
kmp_flag_native<unsigned long long, (flag_type)1, true>::notdone_check() | libomp.so | 0.99 | 1.10 | 1.44 | 0.02 | 0.02 | 0.02 | 0.05 | 0.05 | 0.06 | 62 | 63 | 64 | 0.00 | 0.00 | 0.00 | 0.56 | 0.55 | 0.72 | 0.01 | 0.01 | 0.01 |
__sched_yield | libc.so.6 | 0.83 | 1.04 | 1.27 | 0.02 | 0.02 | 0.02 | 0.04 | 0.06 | 0.04 | 59 | 62 | 63 | 0.00 | 0.00 | 0.00 | 0.53 | 0.53 | 0.64 | 0.01 | 0.01 | 0.01 |
@plt_start@ | libomp.so | 0.05 | 0.07 | 0.11 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.02 | 12 | 16 | 18 | 0.00 | 0.00 | 0.00 | 0.02 | 0.07 | 0.24 | 0.00 | 0.00 | 0.00 |
__kmp_yield | libomp.so | 0.04 | 0.03 | 0.05 | 0.00 | 0.00 | 0.00 | 0.01 | 0.01 | 0.01 | 8 | 6 | 10 | 0.00 | 0.00 | 0.00 | 0.11 | 0.01 | 0.10 | 0.00 | 0.00 | 0.00 |
__aarch64_ldadd4_relax | libomp.so | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 |
__kmp_resume_if_soft_paused | libomp.so | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA |
unknown_function | binary | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 1 | NA | NA | 0.00 | NA | NA | 0.00 | NA | NA | 0.00 | NA |
unknown_kernel_region | kernel | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 3 | 2 | 8 | NA | NA | NA | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |