Run baseline | Run locus440 |
Loop Source Regions | - /home/eoseret/3dtransp_code/permute3d_1.omp.cpp: 92-93 | Loop Source Regions | |
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
19 | 500.66 | 225.51 | 41.29 | 83.33 | 85.42 | | | | | | |
Sum on 1 analyzed binary loop (permute3d_1.omp.exe - 19) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
Analysis | Count | Analysis | Count |
Data Access Issues | | | |
Presence of constant non-unit stride data access | 1 | | |
Presence of expensive instructions: scatter/gather | 1 | | |
Presence of special instructions executing on a single port | 1 | | |
Vectorization Roadblocks | | | |
Presence of constant non-unit stride data access | 1 | | |
Inefficient Vectorization | | | |
Presence of expensive instructions: scatter/gather | 1 | | |
Presence of special instructions executing on a single port | 1 | | |
Run baseline | Run locus440 |
Loop Source Regions | | Loop Source Regions | - /home/eoseret/3dtransp_code/permute3d_1.locus440.cpp: 103-104 |
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| | | | | | 20 | 13.27 | 2.76 | 16.15 | 0 | 12.5 |
No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (permute3d_1.locus440.exe - 20) |
Analysis | Count | Analysis | Count |
| | Data Access Issues | |
| | Presence of constant non-unit stride data access | 1 |
| | Vectorization Roadblocks | |
| | Presence of constant non-unit stride data access | 1 |