Statistics

| Analysis | Count | Percentage | Weighted Count |
|---|---|---|---|
| Loop Computation Issues | 85 | | |
| Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 48 | 80.00 | 0.69 |
| Presence of a large number of scalar integer instructions | 17 | 28.33 | 0.25 |
| Presence of expensive FP instructions | 16 | 26.67 | 0.16 |
| Large loop body over microp cache size | 2 | 3.33 | 0.03 |
| Bottleneck in the front-end | 2 | 3.33 | 0.03 |
| Control Flow Issues | 26 | | |
| Presence of 2 to 4 paths | 15 | 25.00 | 0.18 |
| Presence of calls | 6 | 10.00 | 0.24 |
| Non-innermost loop | 3 | 5.00 | 0.03 |
| Presence of more than 4 paths | 2 | 3.33 | 0.03 |
| Data Access Issues | 97 | | |
| More than 10% of the vector loads instructions are unaligned | 30 | 50.00 | 0.31 |
| More than 20% of the loads are accessing the stack | 23 | 38.33 | 0.45 |
| Presence of indirect access | 15 | 25.00 | 0.21 |
| Presence of constant non-unit stride data access | 12 | 20.00 | 0.19 |
| Presence of special instructions executing on a single port | 9 | 15.00 | 0.12 |
| Presence of expensive instructions: scatter/gather | 8 | 13.33 | 0.09 |
| Vectorization Roadblocks | 62 | | |
| Presence of indirect access | 15 | 25.00 | 0.21 |
| Presence of 2 to 4 paths | 15 | 25.00 | 0.18 |
| Presence of constant non-unit stride data access | 12 | 20.00 | 0.19 |
| Presence of calls | 6 | 10.00 | 0.24 |
| Presence of more than 4 paths | 6 | 10.00 | 0.24 |
| ERROR | 5 | 8.33 | 0.23 |
| Non-innermost loop | 3 | 5.00 | 0.03 |
| Inefficient Vectorization | 27 | | |
| Use of masked instructions | 10 | 16.67 | 0.10 |
| Presence of special instructions executing on a single port | 9 | 15.00 | 0.12 |
| Presence of expensive instructions: scatter/gather | 8 | 13.33 | 0.09 |
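
The Count and Percentage columns above appear to be related by a fixed denominator. The sketch below checks that reading for the Loop Computation Issues rows. It assumes a total of 60 analysed loops, a figure inferred from the ratios (for example 48 / 80.00%) and not stated anywhere in the report. The shortened row names and the helper functions `percentage` and `category_total` are hypothetical and exist only for this illustration.

```python
# Assumption: 60 analysed loops in total. This is inferred from the
# Count/Percentage ratios in the table; the report does not state it.
ASSUMED_TOTAL_LOOPS = 60

# (count, reported percentage) pairs copied from the Loop Computation Issues
# rows. The keys are shortened labels, not the tool's exact wording.
loop_computation_rows = {
    "FMA used for <10% of FP ops": (48, 80.00),
    "many scalar integer instructions": (17, 28.33),
    "expensive FP instructions": (16, 26.67),
    "loop body over micro-op cache size": (2, 3.33),
    "front-end bottleneck": (2, 3.33),
}


def percentage(count, total=ASSUMED_TOTAL_LOOPS):
    """Share of loops flagged by one analysis, rounded to two decimals."""
    return round(100.0 * count / total, 2)


def category_total(rows):
    """Sum of the per-analysis counts, to compare with the category row."""
    return sum(count for count, _ in rows.values())


for name, (count, reported) in loop_computation_rows.items():
    print(f"{name}: computed {percentage(count):.2f}, reported {reported:.2f}")

print("category total:", category_total(loop_computation_rows))
```

Under that assumption the computed percentages match the reported ones, and the per-analysis counts add up to the 85 shown on the category row. The Weighted Count column is not reproduced here because the report does not say how it is derived.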

Details

| Category | Analysis | r_1 | r_2 |
|---|---|---|---|
| Loop Computation Issues | Presence of expensive FP instructions | 8 | 8 |
| | Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA | 24 | 24 |
| | Large loop body over microp cache size | 1 | 1 |
| | Presence of a large number of scalar integer instructions | 6 | 11 |
| | Bottleneck in the front-end | 1 | 1 |
| Control Flow Issues | Presence of calls | 2 | 4 |
| | Presence of 2 to 4 paths | 8 | 7 |
| | Presence of more than 4 paths | 1 | 1 |
| | Non-innermost loop | 2 | 1 |
| Data Access Issues | Presence of constant non-unit stride data access | 5 | 7 |
| | Presence of indirect access | 7 | 8 |
| | More than 10% of the vector loads instructions are unaligned | 17 | 13 |
| | Presence of expensive instructions: scatter/gather | 3 | 5 |
| | Presence of special instructions executing on a single port | 8 | 1 |
| | More than 20% of the loads are accessing the stack | 9 | 14 |
| Vectorization Roadblocks | Presence of calls | 2 | 4 |
| | Presence of 2 to 4 paths | 8 | 7 |
| | Presence of more than 4 paths | 3 | 3 |
| | Non-innermost loop | 2 | 1 |
| | Presence of constant non-unit stride data access | 5 | 7 |
| | Presence of indirect access | 7 | 8 |
| | ERROR | 3 | 2 |
| Inefficient Vectorization | Presence of expensive instructions: scatter/gather | 3 | 5 |
| | Presence of special instructions executing on a single port | 8 | 1 |
| | Use of masked instructions | 5 | 5 |