Run Neoverse V1 GCC Ofast Manual Unroll + SoA | Run Neoverse V1 ACFL Ofast Manual Unroll + SoA |
- /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 69-97 | - /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 69-97 |
ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s |
10 | 63.96 | 1.38 | 1.76 | 64 | 7.41 | 0.13 | 434.62 | 14 | 71.74 | 1.45 | 1.50 | 64 | 6.17 | 0.14 | 379.40 |
Run Neoverse V1 GCC Ofast Manual Unroll + SoA | Run Neoverse V1 ACFL Ofast Manual Unroll + SoA |
ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s |
280 | 26.88 | 0.58 | 0.65 | 63 | 1.76 | 0.04 | 0.00 | 1 | 0.08 | 0.00 | 0.01 | 15 | 0.11 | 0.00 | 0.00 |
240 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 | 528 | 0.92 | 0.02 | 0.04 | 61 | 0.59 | 0.01 | 0.00 |
276 | 3.73 | 0.08 | 0.11 | 63 | 1.19 | 0.03 | 0.00 | 1901 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 |
-1 | 0.00 | 0.00 | 0.00 | 30 | 0.00 | 0.00 | NA | 1241 | 0.04 | 0.00 | 0.01 | 7 | 0.12 | 0.00 | 0.00 |
| | | | | | | | 854 | 18.29 | 0.37 | 0.44 | 63 | 2.89 | 0.05 | 0.00 |
| | | | | | | | 1885 | 0.00 | 0.00 | 0.01 | 1 | 0.00 | 0.00 | 0.00 |
| | | | | | | | 1137 | 1.04 | 0.02 | 0.04 | 63 | 0.53 | 0.01 | 0.00 |
| | | | | | | | 789 | 0.00 | 0.00 | 0.00 | 1 | 0.00 | 0.00 | 0.00 |
| | | | | | | | -1 | 0.00 | 0.00 | 0.00 | 40 | 0.00 | 0.00 | NA |
Run Neoverse V1 GCC Ofast Manual Unroll + SoA | Run Neoverse V1 ACFL Ofast Manual Unroll + SoA |
- /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 111-116 | - /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 111-116 |
ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s | ASM Fct ID | Coverage (%) | Inc Time w.r.t. Wall Time (s) | Max Inc. Time over Threads(s) | Nb Threads | Deviation (cov) | Deviation (tps) | GFLOP/s |
11 | 5.42 | 0.12 | 0.19 | 64 | 3.47 | 0.07 | 28.82 | 16 | 7.89 | 0.16 | 0.21 | 63 | 2.88 | 0.05 | 27.07 |
GCC: Neoverse V1 GCC Ofast Manual Unroll + SoA; ACFL: Neoverse V1 ACFL Ofast Manual Unroll + SoA
Name | Module | Coverage (%), GCC | Coverage (%), ACFL | Inclusive Time w.r.t. Wall Time (s), GCC | Inclusive Time w.r.t. Wall Time (s), ACFL | Max Inc. Time over Threads (s), GCC | Max Inc. Time over Threads (s), ACFL | Nb Threads, GCC | Nb Threads, ACFL | GFLOP/s, GCC | GFLOP/s, ACFL | Deviation (coverage), GCC | Deviation (coverage), ACFL | Deviation (time), GCC | Deviation (time), ACFL |
k_means(int, point_t&, point_t&, int*, int, int) [clone .omp_outlined] | binary | NA | 71.74 | NA | 1.45 | NA | 1.50 | NA | 64 | NA | 379.40 | NA | 6.17 | NA | 0.14 |
k_means(int, point_t&, point_t&, int*, int, int) [clone ._omp_fn.0] | binary | 63.96 | NA | 1.38 | NA | 1.76 | NA | 64 | NA | 434.62 | NA | 7.41 | NA | 0.13 | NA |
gomp_team_barrier_wait_end | libgomp.so.1.0.0 | 26.88 | NA | 0.58 | NA | 0.65 | NA | 63 | NA | 0.00 | NA | 1.76 | NA | 0.04 | NA |
kmp_flag_64<false, true>::wait(kmp_info*, int, void*) | libomp.so | NA | 18.29 | NA | 0.37 | NA | 0.44 | NA | 63 | NA | 0.00 | NA | 2.89 | NA | 0.05 |
k_means(int, point_t&, point_t&, int*, int, int) [clone .omp_outlined.3] | binary | NA | 7.89 | NA | 0.16 | NA | 0.21 | NA | 63 | NA | 27.07 | NA | 2.88 | NA | 0.05 |
k_means(int, point_t&, point_t&, int*, int, int) [clone ._omp_fn.1] | binary | 5.42 | NA | 0.12 | NA | 0.19 | NA | 64 | NA | 28.82 | NA | 3.47 | NA | 0.07 | NA |
gomp_barrier_wait_end | libgomp.so.1.0.0 | 3.73 | NA | 0.08 | NA | 0.11 | NA | 63 | NA | 0.00 | NA | 1.19 | NA | 0.03 | NA |
__sched_yield | libc.so.6 | NA | 1.04 | NA | 0.02 | NA | 0.04 | NA | 63 | NA | 0.00 | NA | 0.53 | NA | 0.01 |
kmp_flag_native<unsigned long long, (flag_type)1, true>::notdone_check() | libomp.so | NA | 0.92 | NA | 0.02 | NA | 0.04 | NA | 61 | NA | 0.00 | NA | 0.59 | NA | 0.01 |
@plt_start@ | libomp.so | NA | 0.08 | NA | 0.00 | NA | 0.01 | NA | 15 | NA | 0.00 | NA | 0.11 | NA | 0.00 |
__kmp_yield | libomp.so | NA | 0.04 | NA | 0.00 | NA | 0.01 | NA | 7 | NA | 0.00 | NA | 0.12 | NA | 0.00 |
__aarch64_ldadd8_acq_rel | libomp.so | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
__default_morecore | libc.so.6 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
__kmp_invoke_microtask | libomp.so | NA | 0.00 | NA | 0.00 | NA | 0.01 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 |
gomp_thread_start | libgomp.so.1.0.0 | 0.00 | NA | 0.00 | NA | 0.00 | NA | 1 | NA | 0.00 | NA | 0.00 | NA | 0.00 | NA |
unknown_kernel_region | kernel | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 30 | 40 | NA | NA | 0.00 | 0.00 | 0.00 | 0.00 |