| Run | Loop Source Regions |
| orig_default | |
| gcc_default | /beegfs/hackathon/users/eoseret/qaas_runs_test/isix02.benchmarkcenter.megware.com/177-218-1582/qe/build/qe/PW/src/h_psi.f90: 140-140 |
| icx_9 | /beegfs/hackathon/users/eoseret/qaas_runs_test/isix02.benchmarkcenter.megware.com/177-218-1582/qe/build/qe/PW/src/h_psi.f90: 140-140 |
| gcc_1 | /beegfs/hackathon/users/eoseret/qaas_runs_test/isix02.benchmarkcenter.megware.com/177-218-1582/qe/build/qe/PW/src/h_psi.f90: 140-140 |

| Run | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | GFLOP/s |
| orig_default | | | | | | | |
| gcc_default | 7089 | 0.08 | 0.05 | 0.17 | 72.73 | 21.59 | 97.45 |
| icx_9 | 11811 | 0.09 | 0.06 | 0.16 | 92.86 | 40.18 | 23.56 |
| gcc_1 | 7768 | 0.06 | 0.04 | 0.13 | 100 | 50 | 38.47 |

| Analysis | Count (orig_default) | Count (gcc_default) | Count (icx_9) | Count (gcc_1) |
| Analyzed loops | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (exec - 7089) | Sum on 1 analyzed binary loop (exec - 11811) | Sum on 1 analyzed binary loop (exec - 7768) |
| Loop Computation Issues | | | | |
| Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | | 1 | 1 | 1 |
| Data Access Issues | | | | |
| Presence of constant non-unit stride data access | | 0 | 0 | 1 |
| Presence of indirect access | | 0 | 1 | 0 |
| More than 10% of the vector loads instructions are unaligned | | 1 | 1 | 1 |
| Presence of expensive instructions: scatter/gather | | 0 | 1 | 0 |
| Presence of special instructions executing on a single port | | 1 | 1 | 1 |
| Vectorization Roadblocks | | | | |
| Presence of constant non-unit stride data access | | | 0 | 1 |
| Presence of indirect access | | | 1 | 0 |
| Inefficient Vectorization | | | | |
| Presence of expensive instructions: scatter/gather | | 0 | 1 | 0 |
| Presence of special instructions executing on a single port | | 1 | 1 | 1 |