| Run G++ O3 + Funroll | Run ACFL O3 + Funroll + Ffastmath |
|---|---|
| Loop Source Regions: /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 60-67 | Loop Source Regions: /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 61-67 |

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|
| G++ O3 + Funroll | 5 | 0.12 | 0.61 | 7.16 | 11.76 | 47.79 | 73.79 |
| ACFL O3 + Funroll + Ffastmath | 9 | 1.15 | 1.82 | 20.35 | 41.67 | 75 | 410.03 |

| Analysis | Count: sum on 1 analyzed binary loop (kmeans-gcc-O3-funroll - 5) | Count: sum on 1 analyzed binary loop (kmeans-acfl-O3-all - 9) |
|---|---|---|
| Loop Computation Issues |  |  |
| Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 0 | 1 |
| Presence of a large number of scalar integer instructions | 1 | 0 |
| Control Flow Issues |  |  |
| Presence of 2 to 4 paths | 0 | 1 |
| Vectorization Roadblocks |  |  |
| Presence of 2 to 4 paths | 0 | 1 |
| Presence of more than 4 paths | 1 | 0 |

| Run G++ O3 + Funroll | Run ACFL O3 + Funroll + Ffastmath |
|---|---|
| Loop Source Regions: /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 81-84 | Loop Source Regions: /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 81-84 |

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
|---|---|---|---|---|---|---|---|
| G++ O3 + Funroll | 13 | 7.43 | 0.59 | 6.97 | 7.89 | 48.03 | 8.47 |
| ACFL O3 + Funroll + Ffastmath | 7 | 7.77 | 0.13 | 1.48 | 11.11 | 52.78 | 37.63 |

| Analysis | Count: sum on 1 analyzed binary loop (kmeans-gcc-O3-funroll - 13) | Count: sum on 1 analyzed binary loop (kmeans-acfl-O3-all - 7) |
|---|---|---|
| Loop Computation Issues |  |  |
| Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 | 1 |
| Presence of a large number of scalar integer instructions | 1 | 1 |
| Data Access Issues |  |  |
| Presence of constant non-unit stride data access | 1 | 0 |
| Presence of indirect access | 1 | 1 |
| Vectorization Roadblocks |  |  |
| Presence of constant non-unit stride data access | 1 | 0 |
| Presence of indirect access | 1 | 1 |