| Run | Loop Source Regions | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Optimizer Analysis |
|---|---|---|---|---|---|---|---|---|---|
| Neoverse V1 ACFL Ofast Base | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 117-123 | 7 | 1.74 | 1.47 | 70.23 | 41.67 | 37.5 | 17.02 | Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 7) |
| Neoverse V1 ACFL Ofast SoA |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Neoverse V1 ACFL Ofast Manual Unroll | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 118-131 | 8 | 1.16 | 1.02 | 60.13 | 35.71 | 38.39 | 10.73 | Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 8) |
| Neoverse V1 ACFL Ofast Manual Unroll + SoA |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |

| Run | Category | Analysis | Count |
|---|---|---|---|
| Neoverse V1 ACFL Ofast Base | Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 |
| Neoverse V1 ACFL Ofast Base | Control Flow Issues | Presence of 2 to 4 paths | 1 |
| Neoverse V1 ACFL Ofast Base | Vectorization Roadblocks | Presence of 2 to 4 paths | 1 |
| Neoverse V1 ACFL Ofast Manual Unroll | Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 |
| Neoverse V1 ACFL Ofast Manual Unroll | Control Flow Issues | Presence of 2 to 4 paths |  |
| Neoverse V1 ACFL Ofast Manual Unroll | Vectorization Roadblocks | Presence of 2 to 4 paths |  |

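
The findings listed above (few FP operations performed as FMA, and 2 to 4 control-flow paths in the loop) can be illustrated with a small sketch. It assumes, without confirmation from the report, that the flagged loop computes squared distances from a point to each centroid and keeps the nearest one, as is typical for k-means. The function `nearest_centroid`, the parameter `dims`, and the flat `centroids` layout are hypothetical and not taken from `main.cpp`. The accumulation is written as an explicit fused multiply-add, and the minimum is tracked with conditional expressions that a compiler can often lower to selects instead of branches.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: index of the centroid closest to `point`.
// `centroids` holds k centroids of `dims` coordinates each, stored contiguously.
std::size_t nearest_centroid(const std::vector<double>& point,
                             const std::vector<double>& centroids,
                             std::size_t dims) {
    const std::size_t k = centroids.size() / dims;
    std::size_t best = 0;
    double best_dist = std::numeric_limits<double>::max();
    for (std::size_t c = 0; c < k; ++c) {
        double acc = 0.0;
        for (std::size_t d = 0; d < dims; ++d) {
            const double diff = point[d] - centroids[c * dims + d];
            acc = std::fma(diff, diff, acc);  // multiply and add fused into one operation
        }
        // Conditional expressions instead of an if-block; whether this
        // becomes a single path depends on the compiler.
        const bool closer = acc < best_dist;
        best = closer ? c : best;
        best_dist = closer ? acc : best_dist;
    }
    return best;
}
```

Whether such a rewrite changes the FMA ratio or the number of paths under `-Ofast` would have to be checked by generating a new report.
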
| Run | Loop Source Regions | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Optimizer Analysis |
|---|---|---|---|---|---|---|---|---|---|
| Neoverse V1 ACFL Ofast Base |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Neoverse V1 ACFL Ofast SoA |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Neoverse V1 ACFL Ofast Manual Unroll |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Neoverse V1 ACFL Ofast Manual Unroll + SoA | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 74-87 | 8 | 1.39 | 1.35 | 66.79 | 11.76 | 25.74 | 382.71 | Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 8) |

| Run | Loop Source Regions | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Optimizer Analysis |
|---|---|---|---|---|---|---|---|---|---|
| Neoverse V1 ACFL Ofast Base |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Neoverse V1 ACFL Ofast SoA | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 71-76 | 8 | 1.51 | 1.28 | 66.10 | 12.5 | 26.56 | 8.73 | Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 8) |
| Neoverse V1 ACFL Ofast Manual Unroll |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Neoverse V1 ACFL Ofast Manual Unroll + SoA |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |

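
The runs labelled SoA presumably store the data as a structure of arrays rather than an array of structures. The report does not show the actual types, so the following is only a sketch for 2D points with hypothetical names (`PointAoS`, `PointsSoA`, `to_soa`, `sq_dist_soa`). In the SoA form each coordinate is read with unit stride, which is the access pattern vectorizers generally handle best.

```cpp
#include <cstddef>
#include <vector>

// Array of structures: the coordinates of one point are adjacent in memory.
struct PointAoS {
    double x;
    double y;
};

// Structure of arrays: each coordinate lives in its own contiguous array.
struct PointsSoA {
    std::vector<double> x;
    std::vector<double> y;
};

// Hypothetical conversion helper from the AoS layout to the SoA layout.
PointsSoA to_soa(const std::vector<PointAoS>& pts) {
    PointsSoA out;
    out.x.reserve(pts.size());
    out.y.reserve(pts.size());
    for (const PointAoS& p : pts) {
        out.x.push_back(p.x);
        out.y.push_back(p.y);
    }
    return out;
}

// Squared distance of every point to one centroid (cx, cy).
// Both p.x and p.y are traversed with unit stride.
std::vector<double> sq_dist_soa(const PointsSoA& p, double cx, double cy) {
    std::vector<double> out(p.x.size());
    for (std::size_t i = 0; i < p.x.size(); ++i) {
        const double dx = p.x[i] - cx;
        const double dy = p.y[i] - cy;
        out[i] = dx * dx + dy * dy;
    }
    return out;
}
```
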
| Run | Loop Source Regions | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Optimizer Analysis |
|---|---|---|---|---|---|---|---|---|---|
| Neoverse V1 ACFL Ofast Base |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Neoverse V1 ACFL Ofast SoA |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Neoverse V1 ACFL Ofast Manual Unroll | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 156-160 | 39 | 0.20 | 0.15 | 8.57 | 0 | 20.83 | 1.51 | Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 39) |
| Neoverse V1 ACFL Ofast Manual Unroll + SoA |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |

| Run | Category | Analysis | Count |
|---|---|---|---|
| Neoverse V1 ACFL Ofast Manual Unroll | Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 |
| Neoverse V1 ACFL Ofast Manual Unroll | Loop Computation Issues | Presence of a large number of scalar integer instructions | 1 |
| Neoverse V1 ACFL Ofast Manual Unroll | Data Access Issues | Presence of indirect access | 1 |
| Neoverse V1 ACFL Ofast Manual Unroll | Vectorization Roadblocks | Presence of indirect access | 1 |

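
"Presence of indirect access" generally means that an array is indexed through a value loaded from another array, as in `a[idx[i]]`. Assuming, without confirmation from the report, that the flagged loop is the k-means centroid update, the pattern could look like the sketch below. The function `accumulate` and all parameter names are hypothetical, and 1D points are used to keep the example short.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical centroid-update step: add each point to the running sum of
// the cluster it is assigned to.
void accumulate(const std::vector<double>& points,
                const std::vector<std::size_t>& assignment,
                std::vector<double>& sums,
                std::vector<std::size_t>& counts) {
    for (std::size_t i = 0; i < points.size(); ++i) {
        const std::size_t c = assignment[i];  // index loaded from memory
        sums[c] += points[i];                 // indirect, scatter-style update
        counts[c] += 1;                       // scalar integer work per point
    }
}
```

Two consecutive iterations may update the same `sums[c]`, so the loop cannot be vectorized as plain independent lanes, which is consistent with the 0% vectorization ratio reported for this loop.
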
| Run | Loop Source Regions | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Optimizer Analysis |
|---|---|---|---|---|---|---|---|---|---|
| Neoverse V1 ACFL Ofast Base |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Neoverse V1 ACFL Ofast SoA |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Neoverse V1 ACFL Ofast Manual Unroll |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Neoverse V1 ACFL Ofast Manual Unroll + SoA |  | 38 | 0.21 | 0.16 | 7.89 | 0 | 0 | 27.07 | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |

| Run | Loop Source Regions | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Optimizer Analysis |
|---|---|---|---|---|---|---|---|---|---|
| Neoverse V1 ACFL Ofast Base | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 140-144 | 25 | 0.19 | 0.17 | 7.87 | 0 | 20.83 | 1.2 | Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 25) |
| Neoverse V1 ACFL Ofast SoA |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Neoverse V1 ACFL Ofast Manual Unroll |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Neoverse V1 ACFL Ofast Manual Unroll + SoA |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |

| Run | Category | Analysis | Count |
|---|---|---|---|
| Neoverse V1 ACFL Ofast Base | Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 |
| Neoverse V1 ACFL Ofast Base | Loop Computation Issues | Presence of a large number of scalar integer instructions | 1 |
| Neoverse V1 ACFL Ofast Base | Data Access Issues | Presence of indirect access | 1 |
| Neoverse V1 ACFL Ofast Base | Vectorization Roadblocks | Presence of indirect access | 1 |

| Run | Loop Source Regions | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s | Optimizer Analysis |
|---|---|---|---|---|---|---|---|---|---|
| Neoverse V1 ACFL Ofast Base |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Neoverse V1 ACFL Ofast SoA | /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 93-97 | 39 | 0.19 | 0.12 | 6.34 | 0 | 20.83 | 1.47 | Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 39) |
| Neoverse V1 ACFL Ofast Manual Unroll |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
| Neoverse V1 ACFL Ofast Manual Unroll + SoA |  |  |  |  |  |  |  |  | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |

| Run | Category | Analysis | Count |
|---|---|---|---|
| Neoverse V1 ACFL Ofast SoA | Loop Computation Issues | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 |
| Neoverse V1 ACFL Ofast SoA | Loop Computation Issues | Presence of a large number of scalar integer instructions | 1 |
| Neoverse V1 ACFL Ofast SoA | Data Access Issues | Presence of indirect access | 1 |
| Neoverse V1 ACFL Ofast SoA | Vectorization Roadblocks | Presence of indirect access | 1 |