Run o3_armpl/spmxv_large_grace_o1-144_m1_ov1_armclang_o3_armpl | Run o3_armpl/spmxv_large_grace_o1-144_m1_ov1_gcc_o3_armpl |
Loop Source Regions | - /home/it4i-hugobol/pop3/epi-spmxv/main.cpp: 201-203 | Loop Source Regions | - /home/it4i-hugobol/pop3/epi-spmxv/main.cpp: 201-203 |
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
25 | 0.97 | 0.94 | 11.22 | 0 | 50 | 24 | 4.35 | 4.59 | 71.17 | 100 | 100 |
24 | 3.21 | 3.17 | 38.06 | 20 | 60 | | | | | | |
Sum on 2 analyzed binary loops (spmxv.exe - 25, spmxv.exe - 24) | Sum on 1 analyzed binary loop (spmxv.exe - 24) |
Analysis | Count | Analysis | Count |
Loop Computation Issues | | Loop Computation Issues | |
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 |
Data Access Issues | | Data Access Issues | |
Presence of constant non-unit stride data access | 0 | Presence of constant non-unit stride data access | 1 |
Presence of indirect access | 1 | Presence of indirect access | 1 |
Vectorization Roadblocks | | Vectorization Roadblocks | |
Presence of constant non-unit stride data access | 0 | Presence of constant non-unit stride data access | 1 |
Presence of indirect access | 1 | Presence of indirect access | 1 |
Run o3_armpl/spmxv_large_grace_o1-144_m1_ov1_armclang_o3_armpl | Run o3_armpl/spmxv_large_grace_o1-144_m1_ov1_gcc_o3_armpl |
Loop Source Regions | | Loop Source Regions | - /home/it4i-hugobol/pop3/epi-spmxv/main.cpp: 188-195 - /home/it4i-hugobol/pop3/epi-spmxv/main.cpp: 205-208 |
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| | | | | | 26 | 0.01 | 0.00 | 0.02 | 0 | 30 |
No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (spmxv.exe - 26) |
Analysis | Count | Analysis | Count |
| | Loop Computation Issues | |
| | Presence of a large number of scalar integer instructions | 1 |
| | Control Flow Issues | |
| | Presence of calls | 1 |
| | Vectorization Roadblocks | |
| | Presence of calls | 1 |
| | Presence of more than 4 paths | 1 |
Run o3_armpl/spmxv_large_grace_o1-144_m1_ov1_armclang_o3_armpl | Run o3_armpl/spmxv_large_grace_o1-144_m1_ov1_gcc_o3_armpl |
Loop Source Regions | - /home/it4i-hugobol/pop3/epi-spmxv/main.cpp: 166-169 | Loop Source Regions | - /home/it4i-hugobol/pop3/epi-spmxv/main.cpp: 166-169 |
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
19 | 0.00 | 0.00 | 0.00 | 0 | 40 | 21 | 0.01 | 0.00 | 0.00 | 100 | 100 |
Sum on 1 analyzed binary loop (spmxv.exe - 19) | Sum on 1 analyzed binary loop (spmxv.exe - 21) |
Analysis | Count | Analysis | Count |
Loop Computation Issues | | Loop Computation Issues | |
Presence of a large number of scalar integer instructions | 1 | Presence of a large number of scalar integer instructions | |
Data Access Issues | | Data Access Issues | |
Presence of constant non-unit stride data access | | Presence of constant non-unit stride data access | 1 |
Vectorization Roadblocks | | Vectorization Roadblocks | |
Presence of constant non-unit stride data access | | Presence of constant non-unit stride data access | 1 |
Run o3_armpl/spmxv_large_grace_o1-144_m1_ov1_armclang_o3_armpl | Run o3_armpl/spmxv_large_grace_o1-144_m1_ov1_gcc_o3_armpl |
Loop Source Regions | - /home/it4i-hugobol/pop3/epi-spmxv/utils/ooo_cmdline.h: 94-99 - /home/it4i-hugobol/pop3/epi-spmxv/utils/ooo_cmdline.h: 105-108 - /usr/lib/gcc/aarch64-redhat-linux/11/../../../../include/c++/11/istream: 219-219 | Loop Source Regions | |
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
188 | 0.02 | 0.00 | 0.00 | 0 | 43.81 | | | | | | |
Sum on 1 analyzed binary loop (spmxv.exe - 188) | No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. |
Analysis | Count | Analysis | Count |
Loop Computation Issues | | | |
Presence of a large number of scalar integer instructions | 1 | | |
Control Flow Issues | | | |
Presence of calls | 1 | | |
Presence of 2 to 4 paths | 1 | | |
Data Access Issues | | | |
Presence of constant non-unit stride data access | 1 | | |
Presence of indirect access | 1 | | |
Vectorization Roadblocks | | | |
Presence of calls | 1 | | |
Presence of 2 to 4 paths | 1 | | |
Presence of constant non-unit stride data access | 1 | | |
Presence of indirect access | 1 | | |
Run o3_armpl/spmxv_large_grace_o1-144_m1_ov1_armclang_o3_armpl | Run o3_armpl/spmxv_large_grace_o1-144_m1_ov1_gcc_o3_armpl |
Loop Source Regions | | Loop Source Regions | - /home/it4i-hugobol/pop3/epi-spmxv/utils/ooo_cmdline.cpp: 345-347 |
ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) | ASM Loop ID | Max Time Over Threads (s) | Time w.r.t. Wall Time (s) | Cov (%) | Vect. Ratio (%) | Vector Length Use (%) |
| | | | | | 205 | 0.00 | 0.00 | 0.00 | 100 | 100 |
No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count. | Sum on 1 analyzed binary loop (spmxv.exe - 205) |
Analysis | Count | Analysis | Count |
| | Loop Computation Issues | |
| | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 1 |
| | Data Access Issues | |
| | Presence of constant non-unit stride data access | 1 |
| | Presence of indirect access | 1 |
| | Vectorization Roadblocks | |
| | Presence of constant non-unit stride data access | 1 |
| | Presence of indirect access | 1 |