Loop Source Regions (identical for all nine runs): /home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 117-123

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Run Skylake GCC O2 | 1 | 104.00 | 88.79 | 82.64 | 7.14 | 12.95 | 75.01 |
| Run Skylake GCC O3 | 0 | 103.81 | 88.07 | 82.44 | 7.14 | 12.95 | 75.14 |
| Run Skylake GCC Ofast | 0 | 12.58 | 13.26 | 86.38 | 58.57 | 19.38 | 73.58 |
| Run Skylake Clang O2 | 8 | 13.78 | 12.46 | 85.51 | 6.25 | 12.5 | 68.03 |
| Run Skylake Clang O3 | 8 | 13.89 | 12.45 | 85.36 | 6.25 | 12.5 | 67.93 |
| Run Skylake Clang O3 + ffast-math | 8 | 11.01 | 12.07 | 85.20 | 55 | 18.59 | 68.49 |
| Run Skylake ICPX O2 | 26 | 13.44 | 11.07 | 78.72 | 57.89 | 18.86 | 78.12 |
| Run Skylake ICPX O3 | 26 | 13.78 | 11.09 | 79.16 | 57.89 | 18.86 | 78.42 |
| Run Skylake ICPX Ofast | 26 | 10.02 | 10.74 | 78.66 | 57.89 | 18.86 | 78.6 |
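The per-run loop times above are easier to compare as ratios against a common baseline. The following is a minimal sketch, not part of the original report: it copies the Max Time Over Threads (s) values for the nine runs and divides the baseline time by each run's time. The choice of GCC O2 as the baseline and the names `max_time_s` and `speedups` are assumptions made for illustration.

```python
# Max Time Over Threads (s) for the analyzed loop, copied from the report.
max_time_s = {
    "GCC O2": 104.00,
    "GCC O3": 103.81,
    "GCC Ofast": 12.58,
    "Clang O2": 13.78,
    "Clang O3": 13.89,
    "Clang O3 + ffast-math": 11.01,
    "ICPX O2": 13.44,
    "ICPX O3": 13.78,
    "ICPX Ofast": 10.02,
}


def speedups(times, baseline="GCC O2"):
    """Return {run: baseline_time / run_time}.

    A value above 1 means the run's loop finished faster than the baseline.
    The baseline is an assumption for illustration, not something the report defines.
    """
    base = times[baseline]
    return {run: base / t for run, t in times.items()}


if __name__ == "__main__":
    # Print runs from fastest to slowest relative to the baseline.
    ranked = sorted(speedups(max_time_s).items(), key=lambda kv: kv[1], reverse=True)
    for run, ratio in ranked:
        print(f"{run:<24}{ratio:6.2f}x")
```

These ratios describe only the single analyzed loop, not whole-program wall time, so they should be read alongside the Cov (%) column.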
Analysis counts: sum on 1 analyzed binary loop per run. Column headers give the binary name followed by the ASM loop ID. Empty cells are empty in the source.

| Analysis | kmeans-gcc-O2 - 1 | kmeans-gcc-O3 - 0 | kmeans-gcc-Ofast - 0 | kmeans-clang-O2 - 8 | kmeans-clang-O3 - 8 | kmeans-clang-O3-ffast-math - 8 | kmeans-icpx-O2 - 26 | kmeans-icpx-O3 - 26 | kmeans-icpx-Ofast - 26 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Loop Computation Issues | | | | | | | | | |
| Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | | | 1 | 0 | 0 | 1 | 1 | 1 | 1 |
| Presence of a large number of scalar integer instructions | | | 1 | 1 | 1 | 1 | 0 | 0 | 0 |
| Control Flow Issues | | | | | | | | | |
| Presence of 2 to 4 paths | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 |
| Presence of more than 4 paths | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Data Access Issues | | | | | | | | | |
| Presence of special instructions executing on a single port | | | 1 | | | 1 | 1 | 1 | 1 |
| Vectorization Roadblocks | | | | | | | | | |
| Presence of 2 to 4 paths | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 |
| Presence of more than 4 paths | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 |
| Inefficient Vectorization | | | | | | | | | |
| Presence of special instructions executing on a single port | | | 1 | | | 1 | 1 | 1 | 1 |