Statistics

Analysis                                                                             Count  Percentage (%)  Weighted Count
Loop Computation Issues                                                                 20
  Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA     11           55.00            0.35
  Presence of a large number of scalar integer instructions                              5           25.00            0.13
  Presence of expensive FP instructions                                                  2           10.00            0.03
  Large loop body over microp cache size                                                 1            5.00            0.01
  Bottleneck in the front-end                                                            1            5.00            0.01
Control Flow Issues                                                                      8
  Presence of calls                                                                      3           15.00            0.20
  Presence of 2 to 4 paths                                                               3           15.00            0.05
  Presence of more than 4 paths                                                          1            5.00            0.02
  Non-innermost loop                                                                     1            5.00            0.01
Data Access Issues                                                                      22
  More than 20% of the loads are accessing the stack                                     7           35.00            0.30
  Presence of indirect access                                                            4           20.00            0.12
  Presence of constant non-unit stride data access                                       4           20.00            0.13
  Presence of special instructions executing on a single port                            3           15.00            0.06
  More than 10% of the vector loads instructions are unaligned                           3           15.00            0.06
  Presence of expensive instructions: scatter/gather                                     1            5.00            0.03
Vectorization Roadblocks                                                                19
  Presence of more than 4 paths                                                          4           20.00            0.22
  Presence of constant non-unit stride data access                                       4           20.00            0.13
  Presence of indirect access                                                            4           20.00            0.12
  Presence of 2 to 4 paths                                                               3           15.00            0.05
  Presence of calls                                                                      3           15.00            0.20
  Non-innermost loop                                                                     1            5.00            0.01
Inefficient Vectorization                                                                5
  Presence of special instructions executing on a single port                            3           15.00            0.06
  Use of masked instructions                                                             1            5.00            0.01
  Presence of expensive instructions: scatter/gather                                     1            5.00            0.03

Details

Analysis                                                                               r_1   r_2
Loop Computation Issues
  Presence of expensive FP instructions                                                  1     1
  Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA      5     6
  Large loop body over microp cache size                                                 1     0
  Presence of a large number of scalar integer instructions                              2     3
  Bottleneck in the front-end                                                            1     0
Control Flow Issues
  Presence of calls                                                                      2     1
  Presence of 2 to 4 paths                                                               2     1
  Presence of more than 4 paths                                                          0     1
  Non-innermost loop                                                                     1     0
Data Access Issues
  Presence of constant non-unit stride data access                                       2     2
  Presence of indirect access                                                            2     2
  More than 10% of the vector loads instructions are unaligned                           3     0
  Presence of expensive instructions: scatter/gather                                     0     1
  Presence of special instructions executing on a single port                            3     0
  More than 20% of the loads are accessing the stack                                     3     4
Vectorization Roadblocks
  Presence of calls                                                                      2     1
  Presence of 2 to 4 paths                                                               2     1
  Presence of more than 4 paths                                                          2     2
  Non-innermost loop                                                                     1     0
  Presence of constant non-unit stride data access                                       2     2
  Presence of indirect access                                                            2     2
Inefficient Vectorization
  Presence of expensive instructions: scatter/gather                                     0     1
  Presence of special instructions executing on a single port                            3     0
  Use of masked instructions                                                             1     0
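
The per-run figures in the Details table appear to add up to the Count column of the Statistics table. The sketch below checks that reading for the Loop Computation Issues category only. It is an illustration, not part of the report tooling: the dictionary and function names are hypothetical, and the numbers are copied by hand from the two tables.

```python
# Per-run counts (r_1, r_2) copied from the Details table,
# Loop Computation Issues category.
details = {
    "Presence of expensive FP instructions": (1, 1),
    "Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA": (5, 6),
    "Large loop body over microp cache size": (1, 0),
    "Presence of a large number of scalar integer instructions": (2, 3),
    "Bottleneck in the front-end": (1, 0),
}

# Count column copied from the Statistics table for the same rows.
statistics_count = {
    "Presence of expensive FP instructions": 2,
    "Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA": 11,
    "Large loop body over microp cache size": 1,
    "Presence of a large number of scalar integer instructions": 5,
    "Bottleneck in the front-end": 1,
}


def total_per_issue(per_run):
    """Sum the per-run counts of each issue (hypothetical helper)."""
    return {issue: sum(counts) for issue, counts in per_run.items()}


totals = total_per_issue(details)

# Category total, to compare with the value shown next to the category name.
category_total = sum(totals.values())

# Issues whose summed per-run counts disagree with the Statistics table.
mismatches = [issue for issue in totals if totals[issue] != statistics_count[issue]]
```

For this category `mismatches` stays empty and `category_total` matches the 20 shown next to "Loop Computation Issues" in the Statistics table. The same check can be repeated for the other categories by copying their rows.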