Loop id | Source Location | Source Function | Level | Coverage run_0 (%) | Max Time Over Threads run_0 (s) | Time w.r.t. Wall Time run_0 (s) | Nb Threads run_0 | Vectorization Ratio (%) | Vectorization Efficiency (%) | Speedup If No Scalar Integer | Speedup If FP Vectorized | Speedup If Fully Vectorized | Speedup If Perfect Load Balancing run_0 | Stride 0 | Stride 1 | Stride n | Stride Unknown | Stride Indirect | Speedup If Data in L1 run_0 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
4 | convf32_avx512 - codelet.c:91-116 | vec_reg_block_conv_bp | Innermost | 96.54 | 10.32 | 10.32 | 1 | 94.44 | 94.79 | 1.31 | 1 | 1.02 | 1 | 1 | 0 | 3 | 3 | 4 | 1.24 |
3 | convf32_avx512 - codelet.c:88-116 | vec_reg_block_conv_bp | InBetween | 2.85 | 0.3 | 0.31 | 1 | 34.78 | 42.66 | 1.81 | 1 | 1.18 | 1 | 3 | 0 | 3.5 | 0.5 | 0 | NA |
2 | convf32_avx512 - codelet.c:86-116 | vec_reg_block_conv_bp | InBetween | 0.51 | 0.05 | 0.05 | 1 | 0 | 10.16 | 1 | 1 | 9.14 | 1 | 5 | 0 | 5 | 3 | 0 | NA |
1 | convf32_avx512 - codelet.c:85-116 | vec_reg_block_conv_bp | InBetween | 0.05 | 0 | 0 | 1 | 0 | 7.59 | 1 | 1 | 14 | 0 | 3.67 | 0 | 4.67 | 0.67 | 0 | NA |