
engine_linuxa64_ompi - 2024-10-22 14:21:12 - MAQAO 2.20.9


Stylizer  

[ 4 / 4 ] Application profile is long enough (1089.09 s)

To have good quality measurements, it is advised that the application profiling time is greater than 10 seconds.

[ 3 / 3 ] Optimization level option is correctly used

[ 0 / 3 ] Most of the time spent in analyzed modules comes from functions without compilation information

Functions without compilation information (typically not compiled with -g) account for 100.00% of the time spent in analyzed modules. Check that -g is present. Remark: if -g is indeed used, this can also be due to compiler built-in functions (typically math) or statically linked libraries. In that case, this warning can be ignored.

[ 2.99 / 3 ] Architecture specific option -march=armv8-a is used

[ 2 / 2 ] Application is correctly profiled ("Others" category represents 0.79 % of the execution time)

To have a representative profile, it is advised that the category "Others" represents less than 20% of the execution time, so that as much of the user code as possible is analyzed.

Optimizer

Loop ID | Analysis | Penalty Score
Loop 5290 - engine_linuxa64_ompi | Execution Time: 9 % - Vectorization Ratio: 38.46 % - Vector Length Use: 32.69 %
Loop Computation Issues | 4
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
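The FMA advice can be illustrated with a minimal C sketch. The kernel below is hypothetical and not taken from the profiled application; the function names are assumptions made for illustration. It shows an add-then-multiply expression rewritten as a multiply-then-add, which is the shape a compiler can contract into a single fused multiply-add. Whether an FMA instruction is actually emitted depends on the compiler and its floating-point contraction settings (for example `-ffp-contract` in GCC and Clang), and the rewritten form may round differently from the original.

```c
#include <stddef.h>

/* Hypothetical kernel: (x[i] + 1.0) * s is an add followed by a multiply.
 * There is no fused instruction for this shape. */
void scale_shifted_naive(double *y, const double *x, double s, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = (x[i] + 1.0) * s;
}

/* Algebraically equivalent form x[i] * s + s: one multiply-add per element,
 * which a compiler may contract into a single FMA instruction. */
void scale_shifted_fma(double *y, const double *x, double s, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = x[i] * s + s;
}
```

Both functions take the same arguments, so one can be swapped for the other and the generated assembly compared to check whether an FMA appears.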
Loop 10194 - engine_linuxa64_ompi | Execution Time: 4 % - Vectorization Ratio: 35.71 % - Vector Length Use: 35.71 %
Loop Computation Issues | 2
[SA] Presence of a large number of scalar integer instructions - Simplify loop structure, perform loop splitting or perform unroll and jam. This issue costs 2 points. | 2
Data Access Issues | 36
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 18 issues (= data accesses) costing 2 points each. | 36
Vectorization Roadblocks | 36
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 18 issues (= data accesses) costing 2 points each. | 36
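As an illustration of the "array restructuring" remedy for constant non-unit stride accesses, the following C sketch contrasts an array-of-structures layout with a structure-of-arrays layout. The types and function names (`PointAoS`, `PointsSoA`, `sum_x_aos`, `sum_x_soa`) are hypothetical; the report does not show the data layout of the analyzed loop, so this is only one plausible cause of such a stride.

```c
#include <stddef.h>

/* Hypothetical array-of-structures layout: consecutive x values are
 * 3 doubles apart, so reading only x is a constant stride-3 access. */
typedef struct { double x, y, z; } PointAoS;

double sum_x_aos(const PointAoS *p, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += p[i].x;
    return s;
}

/* Structure-of-arrays layout: each field is stored contiguously, so the
 * same reduction reads x with unit stride. */
typedef struct { const double *x, *y, *z; } PointsSoA;

double sum_x_soa(const PointsSoA *p, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += p->x[i];
    return s;
}
```

The restructured layout tends to help only when loops touch a subset of the fields; loops that use every field of each element may see little change.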
Loop 9850 - engine_linuxa64_ompi | Execution Time: 3 % - Vectorization Ratio: 0.00 % - Vector Length Use: 24.72 %
Loop Computation Issues | 2
[SA] Presence of a large number of scalar integer instructions - Simplify loop structure, perform loop splitting or perform unroll and jam. This issue costs 2 points. | 2
Data Access Issues | 84
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 24 issues (= data accesses) costing 2 points each. | 48
[SA] Presence of indirect accesses - Use array restructuring or gather instructions to lower the cost. There are 9 issues (= indirect data accesses) costing 4 points each. | 36
Vectorization Roadblocks | 84
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 24 issues (= data accesses) costing 2 points each. | 48
[SA] Presence of indirect accesses - Use array restructuring or gather instructions to lower the cost. There are 9 issues (= indirect data accesses) costing 4 points each. | 36
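For the indirect-access finding, one form of array restructuring is to gather the indexed values into a packed buffer once and then run the hot loop with unit stride. The C sketch below assumes a hypothetical kernel of the form `a[idx[i]] * w[i]`; the function names are illustrative and nothing here is taken from the profiled code. Packing only pays off if the same index list is reused across several passes.

```c
#include <stddef.h>

/* Indirect form: a[idx[i]] requires a gather (or scalar loads) per element. */
double dot_indirect(const double *a, const int *idx, const double *w, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += a[idx[i]] * w[i];
    return s;
}

/* Restructuring step: copy the indexed values into a contiguous buffer.
 * This still performs the indirect loads, but only once. */
void pack_indexed(double *packed, const double *a, const int *idx, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        packed[i] = a[idx[i]];
}

/* Hot loop on the packed buffer: both operands are read with unit stride. */
double dot_packed(const double *packed, const double *w, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += packed[i] * w[i];
    return s;
}
```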
Loop 38994 - engine_linuxa64_ompi | Execution Time: 3 % - Vectorization Ratio: 93.33 % - Vector Length Use: 48.33 %
Loop 5587 - engine_linuxa64_ompi | Execution Time: 1 % - Vectorization Ratio: 93.33 % - Vector Length Use: 65.00 %
Loop Computation Issues | 4
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
Loop 36824 - engine_linuxa64_ompi | Execution Time: 1 % - Vectorization Ratio: 10.61 % - Vector Length Use: 25.19 %
Loop Computation Issues | 6
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
[SA] Presence of a large number of scalar integer instructions - Simplify loop structure, perform loop splitting or perform unroll and jam. This issue costs 2 points. | 2
Control Flow Issues | 2
[SA] Several paths (2 paths) - Simplify control structure or force the compiler to use masked instructions. There are 2 issues (= paths) costing 1 point each. | 2
Data Access Issues | 12
[SA] Presence of indirect accesses - Use array restructuring or gather instructions to lower the cost. There are 3 issues (= indirect data accesses) costing 4 points each. | 12
Vectorization Roadblocks | 14
[SA] Several paths (2 paths) - Simplify control structure or force the compiler to use masked instructions. There are 2 issues (= paths) costing 1 point each. | 2
[SA] Presence of indirect accesses - Use array restructuring or gather instructions to lower the cost. There are 3 issues (= indirect data accesses) costing 4 points each. | 12
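The "several paths" finding refers to branches inside the loop body. A minimal C sketch of simplifying such control flow follows, using a hypothetical reduction that is not taken from the profiled application. The second version expresses the condition as a select, so the loop body has a single path that a compiler can more readily map to masked or select instructions. Many compilers perform this if-conversion on their own, so the benefit is not guaranteed.

```c
#include <stddef.h>

/* Two paths through the loop body: one taken when x[i] > 0, one otherwise. */
double sum_positive_branchy(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) {
        if (x[i] > 0.0)
            s += x[i];
    }
    return s;
}

/* Single path: every iteration adds either x[i] or 0.0, chosen by a select. */
double sum_positive_select(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += (x[i] > 0.0) ? x[i] : 0.0;
    return s;
}
```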
Loop 9938 - engine_linuxa64_ompi | Execution Time: 1 % - Vectorization Ratio: 78.92 % - Vector Length Use: 45.03 %
Loop Computation Issues | 6
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
[SA] Presence of a large number of scalar integer instructions - Simplify loop structure, perform loop splitting or perform unroll and jam. This issue costs 2 points. | 2
Data Access Issues | 84
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 42 issues (= data accesses) costing 2 points each. | 84
Vectorization Roadblocks | 84
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 42 issues (= data accesses) costing 2 points each. | 84
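Loop interchange, the second remedy named for non-unit stride accesses, can be sketched in C as follows. The example assumes a hypothetical row-major matrix stored in a flat array; the function names are illustrative. In the first version the inner loop walks down a column, so consecutive accesses are `cols` elements apart. Swapping the two loops makes the inner loop walk along a row with unit stride. The two versions add the elements in a different order, so floating-point results can differ slightly in general.

```c
#include <stddef.h>

/* Inner loop over rows: m[i * cols + j] is accessed with stride cols. */
double sum_column_order(const double *m, size_t rows, size_t cols)
{
    double s = 0.0;
    for (size_t j = 0; j < cols; ++j)
        for (size_t i = 0; i < rows; ++i)
            s += m[i * cols + j];
    return s;
}

/* Interchanged loops: the inner loop now reads m with unit stride. */
double sum_row_order(const double *m, size_t rows, size_t cols)
{
    double s = 0.0;
    for (size_t i = 0; i < rows; ++i)
        for (size_t j = 0; j < cols; ++j)
            s += m[i * cols + j];
    return s;
}
```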
Loop 38981 - engine_linuxa64_ompi | Execution Time: 1 % - Vectorization Ratio: 0.00 % - Vector Length Use: 23.08 %
Loop Computation Issues | 6
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
[SA] Presence of a large number of scalar integer instructions - Simplify loop structure, perform loop splitting or perform unroll and jam. This issue costs 2 points. | 2
Control Flow Issues | 2
[SA] Several paths (2 paths) - Simplify control structure or force the compiler to use masked instructions. There are 2 issues (= paths) costing 1 point each. | 2
Data Access Issues | 2
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There is 1 issue (= data access) costing 2 points. | 2
Vectorization Roadblocks | 4
[SA] Several paths (2 paths) - Simplify control structure or force the compiler to use masked instructions. There are 2 issues (= paths) costing 1 point each. | 2
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There is 1 issue (= data access) costing 2 points. | 2
Loop 9877 - engine_linuxa64_ompi | Execution Time: 1 % - Vectorization Ratio: 77.86 % - Vector Length Use: 45.00 %
Loop Computation Issues | 6
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
[SA] Presence of a large number of scalar integer instructions - Simplify loop structure, perform loop splitting or perform unroll and jam. This issue costs 2 points. | 2
Data Access Issues | 74
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 37 issues (= data accesses) costing 2 points each. | 74
Vectorization Roadblocks | 74
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 37 issues (= data accesses) costing 2 points each. | 74
Loop 47760 - engine_linuxa64_ompi | Execution Time: 1 % - Vectorization Ratio: 2.11 % - Vector Length Use: 23.87 %
Loop Computation Issues | 140
[SA] Large loop body: exceeds micro-op cache size - Perform loop splitting or reduce unrolling. There are 68 issues (= chunks of 50 instructions) costing 2 points each. | 136
[SA] Presence of a large number of scalar integer instructions - Simplify loop structure, perform loop splitting or perform unroll and jam. This issue costs 2 points. | 2
[SA] Bottleneck in the front end - If loop size is very small (rare occurrences), perform unroll and jam. If loop size is large, perform loop splitting. This issue costs 2 points. | 2
Control Flow Issues | 10
[SA] Presence of calls - Inline them, either through the compiler or by hand, and use SVML for libm calls. There are 8 issues (= calls) costing 1 point each. | 8
[SA] Non-innermost loop (outermost) - Collapse the loop with the innermost ones. This issue costs 2 points. | 2
Vectorization Roadblocks | 1010
[SA] Presence of calls - Inline them, either through the compiler or by hand, and use SVML for libm calls. There are 8 issues (= calls) costing 1 point each. | 8
[SA] Too many paths (at least 1000 paths) - Simplify control structure. There are at least 1000 issues (= paths) costing 1 point each. | 1000
[SA] Non-innermost loop (outermost) - Collapse the loop with the innermost ones. This issue costs 2 points. | 2
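Loop splitting (also called loop fission), recommended above for the oversized loop body, can be sketched in C on a toy example. The kernel and function names are hypothetical and far smaller than a loop of roughly 68 chunks of 50 instructions; the sketch only shows the transformation itself. A loop that performs two independent updates is split into two loops, each with a smaller body. The split is valid here because neither statement reads what the other writes.

```c
#include <stddef.h>

/* Fused form: both independent updates share one loop body. */
void update_fused(double *a, double *b, const double *x, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        a[i] = 2.0 * x[i];
        b[i] = x[i] + 1.0;
    }
}

/* Split form: two loops, each with a smaller body. Legal only because
 * the two statements have no dependence on each other. */
void update_split(double *a, double *b, const double *x, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        a[i] = 2.0 * x[i];
    for (size_t i = 0; i < n; ++i)
        b[i] = x[i] + 1.0;
}
```

Splitting trades a smaller instruction footprint for a second pass over `x`, so it should be measured rather than assumed to help.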
Loop 9933 - engine_linuxa64_ompi | Execution Time: 1 % - Vectorization Ratio: 79.29 % - Vector Length Use: 44.82 %
Loop Computation Issues | 6
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
[SA] Presence of a large number of scalar integer instructions - Simplify loop structure, perform loop splitting or perform unroll and jam. This issue costs 2 points. | 2
Data Access Issues | 60
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 30 issues (= data accesses) costing 2 points each. | 60
Vectorization Roadblocks | 60
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 30 issues (= data accesses) costing 2 points each. | 60
Loop 10130 - engine_linuxa64_ompi | Execution Time: 1 % - Vectorization Ratio: 94.12 % - Vector Length Use: 48.53 %
Loop Computation Issues | 4
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
Data Access Issues | 14
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 7 issues (= data accesses) costing 2 points each. | 14
Vectorization Roadblocks | 14
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 7 issues (= data accesses) costing 2 points each. | 14
Loop 47929 - engine_linuxa64_ompi | Execution Time: 1 % - Vectorization Ratio: 94.12 % - Vector Length Use: 48.53 %
Loop Computation Issues | 4
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
Data Access Issues | 8
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 4 issues (= data accesses) costing 2 points each. | 8
Vectorization Roadblocks | 8
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 4 issues (= data accesses) costing 2 points each. | 8
Loop 10078 - engine_linuxa64_ompi | Execution Time: 1 % - Vectorization Ratio: 98.81 % - Vector Length Use: 49.70 %
Loop Computation Issues | 52
[SA] Presence of expensive FP instructions - Perform hoisting, change algorithm, use SVML or a proper numerical library, or perform value profiling (count the number of distinct input values). There are 12 issues (= instructions) costing 4 points each. | 48
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
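The hoisting remedy for expensive FP instructions can be sketched in C under the assumption that the expensive instruction is a division by a loop-invariant value. The report does not say which instructions were flagged, so this is only one possible case, and the function names are hypothetical. The reciprocal is computed once outside the loop and the per-element division becomes a multiplication. For divisors that are not powers of two, the result can differ from a true division in the last bit.

```c
#include <stddef.h>

/* One floating-point division per element. */
void scale_by_division(double *y, const double *x, double d, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = x[i] / d;
}

/* The loop-invariant reciprocal is hoisted: one division in total,
 * then one multiplication per element. */
void scale_by_reciprocal(double *y, const double *x, double d, size_t n)
{
    const double inv = 1.0 / d;
    for (size_t i = 0; i < n; ++i)
        y[i] = x[i] * inv;
}
```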
Loop 47848 - engine_linuxa64_ompi | Execution Time: 1 % - Vectorization Ratio: 97.50 % - Vector Length Use: 49.38 %
Loop Computation Issues | 4
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
Data Access Issues | 8
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 4 issues (= data accesses) costing 2 points each. | 8
Vectorization Roadblocks | 8
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 4 issues (= data accesses) costing 2 points each. | 8
Loop 9888 - engine_linuxa64_ompi | Execution Time: 1 % - Vectorization Ratio: 0.00 % - Vector Length Use: 24.31 %
Control Flow Issues | 2
[SA] Several paths (2 paths) - Simplify control structure or force the compiler to use masked instructions. There are 2 issues (= paths) costing 1 point each. | 2
Vectorization Roadblocks | 2
[SA] Several paths (2 paths) - Simplify control structure or force the compiler to use masked instructions. There are 2 issues (= paths) costing 1 point each. | 2
Loop 5670 - engine_linuxa64_ompi | Execution Time: 1 % - Vectorization Ratio: 93.33 % - Vector Length Use: 61.67 %
Loop Computation Issues | 4
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
Loop 39163 - engine_linuxa64_ompi | Execution Time: 1 % - Vectorization Ratio: 93.33 % - Vector Length Use: 48.33 %
Loop Computation Issues | 20
[SA] Presence of expensive FP instructions - Perform hoisting, change algorithm, use SVML or a proper numerical library, or perform value profiling (count the number of distinct input values). There are 4 issues (= instructions) costing 4 points each. | 16
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
Loop 36823 - engine_linuxa64_ompi | Execution Time: 0 % - Vectorization Ratio: 10.94 % - Vector Length Use: 22.98 %
Loop Computation Issues | 6
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
[SA] Presence of a large number of scalar integer instructions - Simplify loop structure, perform loop splitting or perform unroll and jam. This issue costs 2 points. | 2
Control Flow Issues | 2
[SA] Several paths (2 paths) - Simplify control structure or force the compiler to use masked instructions. There are 2 issues (= paths) costing 1 point each. | 2
Data Access Issues | 12
[SA] Presence of indirect accesses - Use array restructuring or gather instructions to lower the cost. There are 3 issues (= indirect data accesses) costing 4 points each. | 12
Vectorization Roadblocks | 14
[SA] Several paths (2 paths) - Simplify control structure or force the compiler to use masked instructions. There are 2 issues (= paths) costing 1 point each. | 2
[SA] Presence of indirect accesses - Use array restructuring or gather instructions to lower the cost. There are 3 issues (= indirect data accesses) costing 4 points each. | 12
Loop 47726 - engine_linuxa64_ompi | Execution Time: 0 % - Vectorization Ratio: 30.77 % - Vector Length Use: 38.46 %
Loop Computation Issues | 6
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
[SA] Presence of a large number of scalar integer instructions - Simplify loop structure, perform loop splitting or perform unroll and jam. This issue costs 2 points. | 2
Data Access Issues | 40
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 18 issues (= data accesses) costing 2 points each. | 36
[SA] Presence of indirect accesses - Use array restructuring or gather instructions to lower the cost. There is 1 issue (= indirect data access) costing 4 points. | 4
Vectorization Roadblocks | 40
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 18 issues (= data accesses) costing 2 points each. | 36
[SA] Presence of indirect accesses - Use array restructuring or gather instructions to lower the cost. There is 1 issue (= indirect data access) costing 4 points. | 4
Loop 10055 - engine_linuxa64_ompi | Execution Time: 0 % - Vectorization Ratio: 100.00 % - Vector Length Use: 50.00 %
Loop Computation Issues | 4
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
Loop 36822 - engine_linuxa64_ompi | Execution Time: 0 % - Vectorization Ratio: 10.61 % - Vector Length Use: 24.77 %
Loop Computation Issues | 6
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
[SA] Presence of a large number of scalar integer instructions - Simplify loop structure, perform loop splitting or perform unroll and jam. This issue costs 2 points. | 2
Control Flow Issues | 2
[SA] Several paths (2 paths) - Simplify control structure or force the compiler to use masked instructions. There are 2 issues (= paths) costing 1 point each. | 2
Data Access Issues | 16
[SA] Presence of indirect accesses - Use array restructuring or gather instructions to lower the cost. There are 4 issues (= indirect data accesses) costing 4 points each. | 16
Vectorization Roadblocks | 18
[SA] Several paths (2 paths) - Simplify control structure or force the compiler to use masked instructions. There are 2 issues (= paths) costing 1 point each. | 2
[SA] Presence of indirect accesses - Use array restructuring or gather instructions to lower the cost. There are 4 issues (= indirect data accesses) costing 4 points each. | 16
Loop 9851 - engine_linuxa64_ompi | Execution Time: 0 % - Vectorization Ratio: 0.00 % - Vector Length Use: 25.00 %
Loop Computation Issues | 2
[SA] Presence of a large number of scalar integer instructions - Simplify loop structure, perform loop splitting or perform unroll and jam. This issue costs 2 points. | 2
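Unroll and jam, one of the remedies listed for loops dominated by scalar integer instructions, unrolls an outer loop and fuses the resulting copies of the inner loop. The C sketch below applies it to a hypothetical matrix-vector product; the kernel and function names are assumptions and are not taken from the profiled application. Processing two rows per outer iteration lets the inner loop share its index arithmetic and its loads of `x[j]` between both rows.

```c
#include <stddef.h>

/* Baseline: one row of the row-major n x m matrix per outer iteration. */
void matvec_plain(double *y, const double *a, const double *x,
                  size_t n, size_t m)
{
    for (size_t i = 0; i < n; ++i) {
        double s = 0.0;
        for (size_t j = 0; j < m; ++j)
            s += a[i * m + j] * x[j];
        y[i] = s;
    }
}

/* Unroll and jam: the outer loop is unrolled by 2 and the two inner
 * loops are fused, so x[j] and the j counter are shared by both rows. */
void matvec_unroll_jam(double *y, const double *a, const double *x,
                       size_t n, size_t m)
{
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        double s0 = 0.0, s1 = 0.0;
        for (size_t j = 0; j < m; ++j) {
            s0 += a[i * m + j] * x[j];
            s1 += a[(i + 1) * m + j] * x[j];
        }
        y[i] = s0;
        y[i + 1] = s1;
    }
    /* Remainder row when n is odd. */
    for (; i < n; ++i) {
        double s = 0.0;
        for (size_t j = 0; j < m; ++j)
            s += a[i * m + j] * x[j];
        y[i] = s;
    }
}
```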
Loop 39158 - engine_linuxa64_ompi | Execution Time: 0 % - Vectorization Ratio: 0.00 % - Vector Length Use: 22.32 %
Loop Computation Issues | 6
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
[SA] Presence of a large number of scalar integer instructions - Simplify loop structure, perform loop splitting or perform unroll and jam. This issue costs 2 points. | 2
Control Flow Issues | 10
[SA] Too many paths (6 paths) - Simplify control structure. There are 6 issues (= paths) costing 1 point each, plus an additional penalty of 4 points. | 10
Vectorization Roadblocks | 10
[SA] Too many paths (6 paths) - Simplify control structure. There are 6 issues (= paths) costing 1 point each, plus an additional penalty of 4 points. | 10
Loop 38957 - engine_linuxa64_ompi | Execution Time: 0 % - Vectorization Ratio: 0.00 % - Vector Length Use: 24.24 %
Loop Computation Issues | 18
[SA] Presence of expensive FP instructions - Perform hoisting, change algorithm, use SVML or a proper numerical library, or perform value profiling (count the number of distinct input values). There are 3 issues (= instructions) costing 4 points each. | 12
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
[SA] Presence of a large number of scalar integer instructions - Simplify loop structure, perform loop splitting or perform unroll and jam. This issue costs 2 points. | 2
Control Flow Issues | 3
[SA] Several paths (3 paths) - Simplify control structure or force the compiler to use masked instructions. There are 3 issues (= paths) costing 1 point each. | 3
Data Access Issues | 10
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There is 1 issue (= data access) costing 2 points. | 2
[SA] Presence of indirect accesses - Use array restructuring or gather instructions to lower the cost. There are 2 issues (= indirect data accesses) costing 4 points each. | 8
Vectorization Roadblocks | 13
[SA] Several paths (3 paths) - Simplify control structure or force the compiler to use masked instructions. There are 3 issues (= paths) costing 1 point each. | 3
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There is 1 issue (= data access) costing 2 points. | 2
[SA] Presence of indirect accesses - Use array restructuring or gather instructions to lower the cost. There are 2 issues (= indirect data accesses) costing 4 points each. | 8
Loop 5674 - engine_linuxa64_ompi | Execution Time: 0 % - Vectorization Ratio: 93.33 % - Vector Length Use: 61.67 %
Loop Computation Issues | 4
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
Loop 39157 - engine_linuxa64_ompi | Execution Time: 0 % - Vectorization Ratio: 0.00 % - Vector Length Use: 23.69 %
Loop Computation Issues | 10
[SA] Presence of expensive FP instructions - Perform hoisting, change algorithm, use SVML or a proper numerical library, or perform value profiling (count the number of distinct input values). There is 1 issue (= instruction) costing 4 points. | 4
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
[SA] Presence of a large number of scalar integer instructions - Simplify loop structure, perform loop splitting or perform unroll and jam. This issue costs 2 points. | 2
Control Flow Issues | 3
[SA] Several paths (3 paths) - Simplify control structure or force the compiler to use masked instructions. There are 3 issues (= paths) costing 1 point each. | 3
Vectorization Roadblocks | 3
[SA] Several paths (3 paths) - Simplify control structure or force the compiler to use masked instructions. There are 3 issues (= paths) costing 1 point each. | 3
Loop 9890 - engine_linuxa64_ompi | Execution Time: 0 % - Vectorization Ratio: 97.37 % - Vector Length Use: 49.34 %
Loop Computation Issues | 6
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
[SA] Presence of a large number of scalar integer instructions - Simplify loop structure, perform loop splitting or perform unroll and jam. This issue costs 2 points. | 2
Data Access Issues | 60
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 30 issues (= data accesses) costing 2 points each. | 60
Vectorization Roadblocks | 60
[SA] Presence of constant non-unit stride data access - Use array restructuring, perform loop interchange or use gather instructions to lower the cost a bit. There are 30 issues (= data accesses) costing 2 points each. | 60
Loop 38992 - engine_linuxa64_ompi | Execution Time: 0 % - Vectorization Ratio: 96.77 % - Vector Length Use: 49.19 %
Loop Computation Issues | 4
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
Loop 7060 - engine_linuxa64_ompi | Execution Time: 0 % - Vectorization Ratio: 0.00 % - Vector Length Use: 23.76 %
Loop Computation Issues | 6
[SA] Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA - Reorganize arithmetic expressions to exhibit potential for FMA. This issue costs 4 points. | 4
[SA] Presence of a large number of scalar integer instructions - Simplify loop structure, perform loop splitting or perform unroll and jam. This issue costs 2 points. | 2
Control Flow Issues | 3
[SA] Several paths (3 paths) - Simplify control structure or force the compiler to use masked instructions. There are 3 issues (= paths) costing 1 point each. | 3
Data Access Issues | 16
[SA] Presence of indirect accesses - Use array restructuring or gather instructions to lower the cost. There are 4 issues (= indirect data accesses) costing 4 points each. | 16
Vectorization Roadblocks | 19
[SA] Several paths (3 paths) - Simplify control structure or force the compiler to use masked instructions. There are 3 issues (= paths) costing 1 point each. | 3
[SA] Presence of indirect accesses - Use array restructuring or gather instructions to lower the cost. There are 4 issues (= indirect data accesses) costing 4 points each. | 16