Loop id | Source Location | Source Function | Level | Coverage run_0 (%) | Max Time Over Threads run_0 (s) | Time w.r.t. Wall Time run_0 (s) | Nb Threads run_0 | Vectorization Ratio (%) | Vectorization Efficiency (%) | Speedup If No Scalar Integer | Speedup If FP Vectorized | Speedup If Fully Vectorized | Speedup If Perfect Load Balancing run_0 | Stride 0 | Stride 1 | Stride n | Stride Unknown | Stride Indirect | Speedup If Data in L1 run_0 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
4 | convf32_avx512 - codelet.c:86-95 [...] | optimized_conv_bp | Innermost | 98.37 | 10.88 | 10.88 | 1 | 92.97 | 93.51 | 1.36 | 1 | 1.02 | 1 | 1 | 0 | 3 | 2 | 1 | 1.16 |
3 | convf32_avx512 - codelet.c:85-96 [...] | optimized_conv_bp | InBetween | 1.36 | 0.15 | 0.15 | 1 | 34.78 | 40.17 | 1.79 | 1 | 1.2 | 1 | 1.5 | 0 | 0.5 | 3 | 0 | NA |
2 | convf32_avx512 - codelet.c:81-96 [...] | optimized_conv_bp | InBetween | 0.23 | 0.02 | 0.02 | 1 | 0 | 7.81 | 1 | 1 | 11.43 | 1 | 2 | 0 | 4 | 3 | 0 | NA |
6 | convf32_avx512 - driver.c:354-355 | main | Single | 0.05 | 0 | 0 | 1 | 0 | 10.94 | 1 | 1 | 16 | 0 | 1 | 0 | 0 | 0 | 0 | NA |
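
One way to read the coverage and speedup columns together is to weight each per-loop speedup estimate by that loop's share of run time using Amdahl's law. The sketch below does this for the "Speedup If Fully Vectorized" column. It assumes that this column is a speedup of the loop alone and that "Coverage run_0 (%)" is the loop's fraction of total run time. The helper name `whole_program_speedup` is hypothetical and is not part of any profiler API.

```python
def whole_program_speedup(coverage_pct: float, loop_speedup: float) -> float:
    """Amdahl's law: overall speedup when only a fraction of run time is accelerated.

    Assumption: coverage_pct is the loop's share of total run time (in percent)
    and loop_speedup applies to that loop only.
    """
    fraction = coverage_pct / 100.0
    return 1.0 / ((1.0 - fraction) + fraction / loop_speedup)


# (loop id, Coverage run_0 (%), Speedup If Fully Vectorized), copied from the table
loops = [
    (4, 98.37, 1.02),
    (3, 1.36, 1.2),
    (2, 0.23, 11.43),
    (6, 0.05, 16.0),
]

for loop_id, coverage, speedup in loops:
    overall = whole_program_speedup(coverage, speedup)
    print(f"loop {loop_id}: estimated whole-program speedup {overall:.4f}x")
```

Under these assumptions, the large per-loop figures for loops 2 and 6 (11.43 and 16) correspond to overall gains well under 1%, because those loops account for so little of the run time. Loop 4 dominates the total despite its small per-loop estimate.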