Run | Number processes | Number nodes | Number processes per node | Run Command | MPI Command | Dataset | Run Directory | OMP_NUM_THREADS | OMP_PROC_BIND | OMP_PLACES |
---|---|---|---|---|---|---|---|---|---|---|
OMP1 | 6 | 1 | 6 | `<executable>` | `mpirun -n <number_processes>` | | . | 1 | close | cores |
OMP2 | 6 | 1 | 6 | `<executable>` | `mpirun -n <number_processes>` | | . | 2 | close | cores |
OMP4 | 6 | 1 | 6 | `<executable>` | `mpirun -n <number_processes>` | | . | 4 | close | cores |
OMP8 | 6 | 1 | 6 | `<executable>` | `mpirun -n <number_processes>` | | . | 8 | close | cores |
OMP16 | 6 | 1 | 6 | `<executable>` | `mpirun -n <number_processes>` | | . | 16 | close | cores |
OMP32 | 6 | 1 | 6 | `<executable>` | `mpirun -n <number_processes>` | | . | 32 | close | cores |

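
The six run configurations above differ only in `OMP_NUM_THREADS`. As a rough sketch (not the tool's own launcher), they could be expressed as environment/command pairs as follows. The executable path `./xhpl` is an assumption, since the report only shows the placeholder `<executable>`, and the helper name `run_config` is hypothetical.

```python
# Settings shared by every run in the configuration table.
MPI_PROCESSES = 6          # "Number processes" (all on one node)
OMP_PROC_BIND = "close"
OMP_PLACES = "cores"
EXECUTABLE = "./xhpl"      # assumption: the report only gives <executable>


def run_config(omp_threads):
    """Return (environment, command) for one run, e.g. OMP4 -> 4 threads per rank."""
    env = {
        "OMP_NUM_THREADS": str(omp_threads),
        "OMP_PROC_BIND": OMP_PROC_BIND,
        "OMP_PLACES": OMP_PLACES,
    }
    command = ["mpirun", "-n", str(MPI_PROCESSES), EXECUTABLE]
    return env, command


# One entry per run label used in the report.
runs = {f"OMP{n}": run_config(n) for n in (1, 2, 4, 8, 16, 32)}

# Total OpenMP threads across all MPI ranks for each run.
total_threads = {
    name: MPI_PROCESSES * int(env["OMP_NUM_THREADS"])
    for name, (env, _) in runs.items()
}
print(total_threads)
```

The totals (6 ranks times the per-rank thread count) line up with the `Nb Threads` values reported for the threaded MKL loops, while the HPL loops that stay at 6 threads appear to run on one thread per rank.
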
Loop id | Source Location | Source Function | Level | Exclusive Coverage OMP1 (%) | Exclusive Coverage OMP2 (%) | Exclusive Coverage OMP4 (%) | Exclusive Coverage OMP8 (%) | Exclusive Coverage OMP16 (%) | Exclusive Coverage OMP32 (%) | Inclusive Coverage OMP1 (%) | Inclusive Coverage OMP2 (%) | Inclusive Coverage OMP4 (%) | Inclusive Coverage OMP8 (%) | Inclusive Coverage OMP16 (%) | Inclusive Coverage OMP32 (%) | Max Exclusive Time Over Threads OMP1 (s) | Max Exclusive Time Over Threads OMP2 (s) | Max Exclusive Time Over Threads OMP4 (s) | Max Exclusive Time Over Threads OMP8 (s) | Max Exclusive Time Over Threads OMP16 (s) | Max Exclusive Time Over Threads OMP32 (s) | Max Inclusive Time Over Threads OMP1 (s) | Max Inclusive Time Over Threads OMP2 (s) | Max Inclusive Time Over Threads OMP4 (s) | Max Inclusive Time Over Threads OMP8 (s) | Max Inclusive Time Over Threads OMP16 (s) | Max Inclusive Time Over Threads OMP32 (s) | Exclusive Time w.r.t. Wall Time OMP1 (s) | Exclusive Time w.r.t. Wall Time OMP2 (s) | Exclusive Time w.r.t. Wall Time OMP4 (s) | Exclusive Time w.r.t. Wall Time OMP8 (s) | Exclusive Time w.r.t. Wall Time OMP16 (s) | Exclusive Time w.r.t. Wall Time OMP32 (s) | Inclusive Time w.r.t. Wall Time OMP1 (s) | Inclusive Time w.r.t. Wall Time OMP2 (s) | Inclusive Time w.r.t. Wall Time OMP4 (s) | Inclusive Time w.r.t. Wall Time OMP8 (s) | Inclusive Time w.r.t. Wall Time OMP16 (s) | Inclusive Time w.r.t. Wall Time OMP32 (s) | Nb Threads OMP1 | Nb Threads OMP2 | Nb Threads OMP4 | Nb Threads OMP8 | Nb Threads OMP16 | Nb Threads OMP32 | GFLOPS OMP1 | GFLOPS OMP2 | GFLOPS OMP4 | GFLOPS OMP8 | GFLOPS OMP16 | GFLOPS OMP32 | Vectorization Ratio (%) | Vector Length Use (%) | Speedup If No Scalar Integer | Speedup If FP Vectorized | Speedup If Fully Vectorized | Speedup If Perfect Load Balancing OMP1 | Speedup If Perfect Load Balancing OMP2 | Speedup If Perfect Load Balancing OMP4 | Speedup If Perfect Load Balancing OMP8 | Speedup If Perfect Load Balancing OMP16 | Speedup If Perfect Load Balancing OMP32 | Stride 0 | Stride 1 | Stride n | Stride Unknown | Stride Indirect | Array Access Efficiency | (OMP1) Efficiency | (OMP1) Potential Speed-Up (%) | (OMP2) Efficiency | (OMP2) Potential Speed-Up (%) | (OMP4) Efficiency | (OMP4) Potential Speed-Up (%) | (OMP8) Efficiency | (OMP8) Potential Speed-Up (%) | (OMP16) Efficiency | (OMP16) Potential Speed-Up (%) | (OMP32) Efficiency | (OMP32) Potential Speed-Up (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
11339 | xhpl - | mkl_blas_avx512_dgemm_kernel_0 | Innermost | 55.65 | 53.08 | 49.74 | 44.12 | 37.14 | 31.85 | 55.65 | 53.08 | 49.74 | 44.12 | 37.14 | 31.85 | 609.27 | 310.21 | 159.05 | 83.79 | 47.13 | 31.72 | 609.27 | 310.21 | 159.05 | 83.79 | 47.13 | 31.72 | 607.22 | 329.96 | 186.25 | 111.91 | 72.22 | 54.44 | 607.22 | 329.96 | 186.25 | 111.91 | 72.22 | 54.44 | 6 | 12 | 24 | 48 | 96 | 192 | 722.86 | 1330.27 | 2356.64 | 3919.00 | 6072.79 | 8032.87 | 77.14 | 80 | 1 | 1 | 1 | 1 | 1.02 | 1.02 | 1.02 | 1.03 | 1.03 | 0 | 0 | 0 | 3 | 0 | 50.00 | 1 | 0 | 0.92 | 4.24 | 0.82 | 9.2 | 0.68 | 14.19 | 0.53 | 17.62 | 0.35 | 20.75 |
11337 | xhpl - | mkl_blas_avx512_dgemm_kernel_0 | Innermost | 21.02 | 20.11 | 18.86 | 16.70 | 14.01 | 12.06 | 21.02 | 20.11 | 18.86 | 16.70 | 14.01 | 12.06 | 232.34 | 118.64 | 61.05 | 32.04 | 17.98 | 12.39 | 232.34 | 118.64 | 61.05 | 32.04 | 17.98 | 12.39 | 229.37 | 125.01 | 70.62 | 42.35 | 27.25 | 20.60 | 229.37 | 125.01 | 70.62 | 42.35 | 27.25 | 20.60 | 6 | 12 | 24 | 48 | 96 | 192 | 722.86 | 1331.98 | 2354.62 | 3918.21 | 6088.84 | 8022.63 | 77.14 | 80 | 1 | 1 | 1 | 1.01 | 1.03 | 1.03 | 1.03 | 1.04 | 1.07 | 0 | 0 | 0 | 3 | 0 | 50.00 | 1 | 0 | 0.92 | 1.66 | 0.81 | 3.55 | 0.68 | 5.39 | 0.53 | 6.64 | 0.35 | 7.86 |
11338 | xhpl - | mkl_blas_avx512_dgemm_kernel_0 | Innermost | 7.99 | 7.30 | 6.80 | 6.05 | 5.01 | 4.35 | 7.99 | 7.30 | 6.80 | 6.05 | 5.01 | 4.35 | 88.47 | 44.04 | 22.68 | 12.20 | 6.62 | 4.55 | 88.47 | 44.04 | 22.68 | 12.20 | 6.62 | 4.55 | 87.20 | 45.38 | 25.48 | 15.35 | 9.75 | 7.43 | 87.20 | 45.38 | 25.48 | 15.35 | 9.75 | 7.43 | 6 | 12 | 24 | 48 | 96 | 192 | 636.03 | 1220.36 | 2172.69 | 3601.09 | 5664.80 | 7406.33 | 77.14 | 80 | 1 | 1 | 1 | 1.02 | 1.05 | 1.06 | 1.08 | 1.07 | 1.09 | 0 | 0 | 0 | 4 | 0 | 50.00 | 1 | 0 | 0.96 | 0.29 | 0.86 | 0.98 | 0.71 | 1.75 | 0.56 | 2.21 | 0.37 | 2.75 |
404 | xhpl - HPL_dlaswp04N.c:180-226 | HPL_dlaswp04N | Innermost | 1.26 | 1.18 | 1.02 | 0.86 | 0.64 | 0.41 | 1.26 | 1.18 | 1.02 | 0.86 | 0.64 | 0.41 | 15.04 | 15.14 | 13.98 | 13.76 | 13.34 | 13.80 | 15.04 | 15.14 | 13.98 | 13.76 | 13.34 | 13.80 | 13.75 | 7.31 | 3.84 | 2.17 | 1.24 | 0.70 | 13.75 | 7.31 | 3.84 | 2.17 | 1.24 | 0.70 | 6 | 6 | 6 | 6 | 6 | 6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0 | 12.5 | 1.02 | 1 | 8 | 1.1 | 1.12 | 1.09 | 1.08 | 1.06 | 1.09 | 0 | 2 | 0 | 63 | 2 | 50.00 | 1 | 0 | 0.94 | 0.07 | 0.9 | 0.11 | 0.79 | 0.18 | 0.69 | 0.2 | 0.61 | 0.16 |
11335 | xhpl - | mkl_blas_avx512_dgemm_kernel_0 | InBetween | 0.78 | 0.76 | 0.72 | 0.64 | 0.52 | 0.43 | 85.43 | 81.25 | 76.12 | 67.51 | 56.69 | 48.69 | 9.18 | 4.91 | 2.70 | 1.48 | 0.84 | 0.60 | 935.48 | 472.71 | 242.45 | 127.57 | 70.96 | 47.63 | 8.49 | 4.73 | 2.69 | 1.62 | 1.02 | 0.73 | 932.28 | 505.08 | 285.04 | 171.23 | 110.23 | 83.21 | 6 | 12 | 24 | 48 | 96 | 192 | 501.81 | 908.65 | 1601.24 | 2652.51 | 4228.40 | 5851.85 | 96.3 | 96.76 | 1 | 1 | 1 | 1.08 | 1.12 | 1.2 | 1.24 | 1.31 | 1.47 | NA | NA | NA | NA | NA | 0.00 | 1 | 0 | 0.9 | 0.08 | 0.79 | 0.15 | 0.65 | 0.22 | 0.52 | 0.25 | 0.36 | 0.27 |
5992 | xhpl - | mkl_blas_avx512_dgemm_dcopy_right8_ea | Innermost | 0.71 | 0.28 | 0.15 | 0.09 | 0.06 | 0.06 | 0.71 | 0.28 | 0.15 | 0.09 | 0.06 | 0.06 | 8.16 | 1.81 | 0.58 | 0.23 | 0.12 | 0.11 | 8.16 | 1.81 | 0.58 | 0.23 | 0.12 | 0.11 | 7.70 | 1.74 | 0.54 | 0.22 | 0.12 | 0.11 | 7.70 | 1.74 | 0.54 | 0.22 | 0.12 | 0.11 | 6 | 12 | 24 | 48 | 96 | 192 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100 | 57.14 | 1 | 1 | 2 | 1.06 | 1.13 | 1.26 | 1.46 | 1.55 | 1.93 | 0 | 3 | 0 | 0 | 0 | 100.00 | 1 | 0 | 2.22 | 0 | 3.54 | 0 | 4.47 | 0 | 3.94 | 0 | 2.28 | 0 |
347 | xhpl - HPL_dlaswp01N.c:160-191 | HPL_dlaswp01N | Innermost | 0.55 | 0.52 | 0.48 | 0.40 | 0.31 | 0.20 | 0.55 | 0.52 | 0.48 | 0.40 | 0.31 | 0.20 | 6.35 | 6.38 | 6.35 | 6.37 | 6.39 | 6.43 | 6.35 | 6.38 | 6.35 | 6.37 | 6.39 | 6.43 | 6.00 | 3.24 | 1.80 | 1.02 | 0.60 | 0.34 | 6.00 | 3.24 | 1.80 | 1.02 | 0.60 | 0.34 | 6 | 6 | 6 | 6 | 6 | 6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0 | 12.41 | 1.94 | 1 | 11.83 | 1.06 | 1.06 | 1.05 | 1.06 | 1.06 | 1.06 | 2 | 2 | 0 | 1 | 1 | 75.00 | 1 | 0 | 0.92 | 0.04 | 0.83 | 0.08 | 0.73 | 0.11 | 0.63 | 0.11 | 0.56 | 0.09 |
10487 | xhpl - | mkl_blas_avx512_dtrsm_kernel_ll_0 | Innermost | 0.47 | 0.88 | 1.02 | 0.94 | 0.75 | 0.52 | 0.47 | 0.88 | 1.02 | 0.94 | 0.75 | 0.52 | 5.21 | 7.77 | 4.05 | 2.00 | 1.10 | 0.60 | 5.21 | 7.77 | 4.05 | 2.00 | 1.10 | 0.60 | 5.15 | 5.46 | 3.81 | 2.39 | 1.46 | 0.88 | 5.15 | 5.46 | 3.81 | 2.39 | 1.46 | 0.88 | 6 | 12 | 24 | 48 | 96 | 192 | 690.17 | 649.80 | 931.54 | 1485.16 | 2432.71 | 4017.76 | 77.14 | 80 | 1 | 1 | 1 | 1.01 | 1.54 | 1.27 | 1.14 | 1.19 | 1.2 | 0 | 0 | 0 | 3 | 0 | 50.00 | 1 | 0 | 0.47 | 0.46 | 0.34 | 0.67 | 0.27 | 0.69 | 0.22 | 0.59 | 0.18 | 0.42 |
369 | xhpl - HPL_dlaswp02N.c:160-189 | HPL_dlaswp02N | Innermost | 0.33 | 0.31 | 0.29 | 0.24 | 0.18 | 0.12 | 0.33 | 0.31 | 0.29 | 0.24 | 0.18 | 0.12 | 3.71 | 3.69 | 3.71 | 3.68 | 3.66 | 3.72 | 3.71 | 3.69 | 3.71 | 3.68 | 3.66 | 3.72 | 3.64 | 1.96 | 1.08 | 0.62 | 0.36 | 0.20 | 3.64 | 1.96 | 1.08 | 0.62 | 0.36 | 0.20 | 6 | 6 | 6 | 6 | 6 | 6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0 | 12.5 | 1.28 | 1 | 8 | 1.02 | 1.02 | 1.02 | 1.02 | 1.01 | 1.03 | 2 | 1 | 0 | 2 | 0 | 80.00 | 1 | 0 | 0.93 | 0.02 | 0.84 | 0.05 | 0.74 | 0.06 | 0.64 | 0.07 | 0.57 | 0.05 |
10485 | xhpl - | mkl_blas_avx512_dtrsm_kernel_ll_0 | InBetween | 0.24 | 0.29 | 0.30 | 0.27 | 0.21 | 0.15 | 0.71 | 1.17 | 1.32 | 1.22 | 0.96 | 0.67 | 2.79 | 2.31 | 1.13 | 0.64 | 0.34 | 0.21 | 7.86 | 10.00 | 5.12 | 2.55 | 1.36 | 0.74 | 2.59 | 1.81 | 1.13 | 0.69 | 0.42 | 0.26 | 7.73 | 7.28 | 4.95 | 3.08 | 1.88 | 1.15 | 6 | 12 | 24 | 48 | 96 | 192 | 111.64 | 160.20 | 253.98 | 419.81 | 690.83 | 1105.30 | 99.45 | 86.41 | 1 | 1 | 1.17 | 1.08 | 1.38 | 1.19 | 1.26 | 1.27 | 1.42 | 1.33 | 0 | 0.67 | 1.33 | 0 | 75.00 | 1 | 0 | 0.71 | 0.08 | 0.57 | 0.13 | 0.47 | 0.15 | 0.39 | 0.13 | 0.31 | 0.11 |
6042 | xhpl - | mkl_blas_avx512_dgemm_dcopy_down24_ea | Innermost | 0.22 | 0.38 | 0.38 | 0.42 | 0.43 | 0.47 | 0.22 | 0.38 | 0.38 | 0.42 | 0.43 | 0.47 | 2.70 | 2.41 | 1.39 | 1.03 | 0.73 | 0.73 | 2.70 | 2.41 | 1.39 | 1.03 | 0.73 | 0.73 | 2.36 | 2.37 | 1.40 | 1.07 | 0.84 | 0.80 | 2.36 | 2.37 | 1.40 | 1.07 | 0.84 | 0.80 | 6 | 12 | 24 | 48 | 96 | 192 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100 | 100 | 1.04 | 1 | 1 | 1.15 | 1.1 | 1.18 | 1.32 | 1.38 | 1.6 | 0 | 1 | 0 | 2 | 0 | 66.67 | 1 | 0 | 0.5 | 0.19 | 0.42 | 0.22 | 0.28 | 0.3 | 0.18 | 0.36 | 0.09 | 0.43 |
5629 | xhpl - | mkl_blas_avx512_dgemm_kernel_nocopy_NN_b1 | Innermost | 0.11 | 0.04 | 0.06 | 0.08 | 0.07 | 0.06 | 0.11 | 0.04 | 0.06 | 0.08 | 0.07 | 0.06 | 1.30 | 0.48 | 0.50 | 0.44 | 0.27 | 0.16 | 1.30 | 0.48 | 0.50 | 0.44 | 0.27 | 0.16 | 1.25 | 0.22 | 0.23 | 0.21 | 0.14 | 0.10 | 1.25 | 0.22 | 0.23 | 0.21 | 0.14 | 0.10 | 6 | 12 | 24 | 48 | 96 | 192 | 547.57 | 859.32 | 1535.70 | 2453.18 | 3688.77 | 5167.18 | 77.14 | 80 | 1 | 1 | 1 | 1.04 | 2.36 | 2.57 | 2.9 | 3.03 | 2.93 | 0 | 0 | 0 | 10 | 0 | 50.00 | 1 | 0 | 2.81 | 0 | 1.36 | 0 | 0.75 | 0.02 | 0.56 | 0.03 | 0.39 | 0.04 |
12945 | xhpl - | __intel_avx_rep_memcpy | Single | 0.10 | 0.10 | 0.09 | 0.07 | 0.05 | 0.04 | 0.10 | 0.10 | 0.09 | 0.07 | 0.05 | 0.04 | 1.20 | 1.18 | 1.13 | 1.11 | 1.11 | 1.15 | 1.20 | 1.18 | 1.13 | 1.11 | 1.11 | 1.15 | 1.12 | 0.60 | 0.32 | 0.18 | 0.11 | 0.06 | 1.12 | 0.60 | 0.32 | 0.18 | 0.11 | 0.06 | 6 | 6 | 6 | 6 | 6 | 6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 100 | 50 | 1 | 1 | 2 | 1.07 | 1.06 | 1.04 | 1.03 | 1.04 | 1.02 | 0 | 2 | 0 | 0 | 0 | 100.00 | 1 | 0 | 0.93 | 0.01 | 0.87 | 0.01 | 0.76 | 0.02 | 0.66 | 0.02 | 0.56 | 0.02 |
6286 | xhpl - | mkl_blas_avx512_dgemv_n_intrinsics | Innermost | 0.09 | 0.10 | 0.11 | 0.10 | 0.11 | 0.12 | 0.09 | 0.10 | 0.11 | 0.10 | 0.11 | 0.12 | 1.04 | 0.67 | 0.37 | 0.22 | 0.14 | 0.13 | 1.04 | 0.67 | 0.37 | 0.22 | 0.14 | 0.13 | 1.03 | 0.64 | 0.39 | 0.26 | 0.20 | 0.20 | 1.03 | 0.64 | 0.39 | 0.26 | 0.20 | 0.20 | 6 | 12 | 24 | 48 | 96 | 192 | 29.15 | 46.88 | 75.95 | 114.84 | 145.01 | 144.67 | 100 | 100 | 1 | 1 | 1 | 1.02 | 1.13 | 1.12 | 1.13 | 1.12 | 1.13 | 0 | 2 | 0 | 0 | 0 | 100.00 | 1 | 0 | 0.8 | 0.02 | 0.65 | 0.04 | 0.49 | 0.05 | 0.31 | 0.07 | 0.16 | 0.1 |
381 | xhpl - HPL_dlaswp03N.c:147-177 | HPL_dlaswp03N | Innermost | 0.09 | 0.08 | 0.08 | 0.06 | 0.05 | 0.03 | 0.09 | 0.08 | 0.08 | 0.06 | 0.05 | 0.03 | 0.96 | 0.95 | 0.97 | 0.95 | 0.98 | 0.99 | 0.96 | 0.95 | 0.97 | 0.95 | 0.98 | 0.99 | 0.94 | 0.50 | 0.28 | 0.16 | 0.09 | 0.05 | 0.94 | 0.50 | 0.28 | 0.16 | 0.09 | 0.05 | 6 | 6 | 6 | 6 | 6 | 6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0 | 12.5 | 1.03 | 1 | 8 | 1.03 | 1.02 | 1.02 | 1.02 | 1.05 | 1.03 | 0 | 0 | 0 | 33 | 1 | 48.53 | 1 | 0 | 0.93 | 0.01 | 0.83 | 0.01 | 0.74 | 0.02 | 0.63 | 0.02 | 0.55 | 0.01 |
311 | xhpl - HPL_pdgesv0.c:149-150 | HPL_pdgesv0 | Innermost | 0.07 | 0.07 | 0.05 | 0.04 | 0.03 | 0.02 | 0.07 | 0.07 | 0.05 | 0.04 | 0.03 | 0.02 | 0.85 | 0.98 | 0.75 | 0.67 | 0.74 | 0.72 | 0.85 | 0.98 | 0.75 | 0.67 | 0.74 | 0.72 | 0.82 | 0.44 | 0.19 | 0.11 | 0.06 | 0.03 | 0.82 | 0.44 | 0.19 | 0.11 | 0.06 | 0.03 | 6 | 6 | 6 | 6 | 6 | 6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0 | 12.5 | 1 | 1 | 8 | 1.05 | 1.22 | 1.16 | 1.1 | 1.2 | 1.16 | 2 | 0 | 0 | 0 | 0 | 100.00 | 1 | 0 | 0.94 | 0 | 1.06 | -0 | 0.97 | 0 | 0.84 | 0 | 0.74 | 0.01 |
104 | xhpl - HPL_pdlange.c:173-173 | HPL_pdlange | Innermost | 0.07 | 0.06 | 0.06 | 0.05 | 0.04 | 0.02 | 0.07 | 0.06 | 0.06 | 0.05 | 0.04 | 0.02 | 0.73 | 0.74 | 0.74 | 0.74 | 0.77 | 0.74 | 0.73 | 0.74 | 0.74 | 0.74 | 0.77 | 0.74 | 0.73 | 0.39 | 0.22 | 0.13 | 0.07 | 0.04 | 0.73 | 0.39 | 0.22 | 0.13 | 0.07 | 0.04 | 6 | 6 | 6 | 6 | 6 | 6 | 27.54 | 50.70 | 91.84 | 159.82 | 272.03 | 493.11 | 100 | 100 | 1 | 1 | 1 | 1.01 | 1.01 | 1.01 | 1.01 | 1.04 | 1.01 | 0 | 1 | 0 | 0 | 0 | 100.00 | 1 | 0 | 0.92 | 0.01 | 0.83 | 0.01 | 0.72 | 0.01 | 0.62 | 0.01 | 0.56 | 0.01 |
102 | xhpl - HPL_pdlange.c:210-211 | HPL_pdlange | Innermost | 0.07 | 0.06 | 0.06 | 0.05 | 0.04 | 0.02 | 0.07 | 0.06 | 0.06 | 0.05 | 0.04 | 0.02 | 0.75 | 0.75 | 0.75 | 0.75 | 0.75 | 0.75 | 0.75 | 0.75 | 0.75 | 0.75 | 0.75 | 0.75 | 0.72 | 0.39 | 0.22 | 0.12 | 0.07 | 0.04 | 0.72 | 0.39 | 0.22 | 0.12 | 0.07 | 0.04 | 6 | 6 | 6 | 6 | 6 | 6 | 27.68 | 51.33 | 92.71 | 161.87 | 278.23 | 498.15 | 100 | 100 | 1 | 1 | 1 | 1.04 | 1.03 | 1.04 | 1.04 | 1.03 | 1.03 | 0 | 3 | 0 | 0 | 0 | 100.00 | 1 | 0 | 0.93 | 0 | 0.84 | 0.01 | 0.73 | 0.01 | 0.63 | 0.01 | 0.56 | 0.01 |
5627 | xhpl - | mkl_blas_avx512_dgemm_kernel_nocopy_NN_b1 | InBetween | 0.05 | 0.02 | 0.03 | 0.03 | 0.03 | 0.03 | 0.19 | 0.06 | 0.10 | 0.17 | 0.14 | 0.13 | 0.55 | 0.31 | 0.19 | 0.21 | 0.11 | 0.08 | 2.19 | 0.76 | 0.74 | 0.80 | 0.48 | 0.32 | 0.52 | 0.15 | 0.10 | 0.08 | 0.06 | 0.05 | 2.06 | 0.40 | 0.37 | 0.42 | 0.28 | 0.21 | 6 | 12 | 24 | 48 | 96 | 192 | 93.34 | 152.93 | 290.36 | 410.81 | 580.29 | 616.41 | 97.56 | 97.87 | 1 | 1 | 1 | 1.06 | 2.3 | 2.32 | 3.43 | 2.93 | 2.65 | NA | NA | NA | NA | NA | 0.00 | 1 | 0 | 1.78 | 0 | 1.33 | 0 | 0.78 | 0.01 | 0.57 | 0.01 | 0.3 | 0.02 |
5552 | xhpl - | mkl_blas_avx512_dgemm_kernel_nocopy_NN_b1 | Outermost | 0.01 | 0.02 | 0.02 | 0.02 | 0.02 | 0.02 | 0.21 | 0.10 | 0.13 | 0.20 | 0.19 | 0.18 | 0.13 | 0.13 | 0.09 | 0.08 | 0.06 | 0.05 | 2.45 | 0.99 | 0.84 | 0.88 | 0.57 | 0.37 | 0.12 | 0.11 | 0.06 | 0.04 | 0.03 | 0.03 | 2.33 | 0.60 | 0.48 | 0.51 | 0.37 | 0.31 | 6 | 12 | 24 | 48 | 96 | 192 | 216.80 | 334.98 | 610.38 | 743.61 | 849.63 | 909.68 | 96.22 | 96.7 | 1 | 1 | 1 | 1.12 | 1.28 | 1.9 | 2.49 | 3.26 | 3.08 | NA | NA | NA | NA | NA | 0.00 | 1 | 0 | 0.55 | 0.01 | 0.54 | 0.01 | 0.37 | 0.01 | 0.24 | 0.01 | 0.12 | 0.02 |
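
The `(OMPn) Efficiency` columns appear consistent with a strong-scaling efficiency computed from the `Exclusive Time w.r.t. Wall Time` columns, taking OMP1 as the reference run. The sketch below checks that reading for loop 11339 only. The formula is inferred from the numbers in the table, not taken from the tool's documentation, and the function name `scaling_efficiency` is hypothetical.

```python
# Values copied from the table row for loop 11339
# ("Exclusive Time w.r.t. Wall Time" for OMP1 .. OMP32, in seconds).
OMP_THREADS = [1, 2, 4, 8, 16, 32]
WALL_TIME_S = [607.22, 329.96, 186.25, 111.91, 72.22, 54.44]


def scaling_efficiency(times, threads):
    """Efficiency relative to the first run: (T_ref / T_n) / (n / n_ref).

    Assumed definition: 1.0 means the loop got n times faster with n times
    more OpenMP threads per rank; lower values mean sub-linear scaling.
    """
    t_ref, n_ref = times[0], threads[0]
    return [(t_ref / t) / (n / n_ref) for t, n in zip(times, threads)]


efficiency = [round(e, 2) for e in scaling_efficiency(WALL_TIME_S, OMP_THREADS)]
print(efficiency)
```

For this loop the rounded results agree with the reported efficiencies (1, 0.92, 0.82, 0.68, 0.53, 0.35). The same arithmetic also reproduces the values for loop 404 (for example 13.75 / 7.31 / 2 ≈ 0.94), which suggests the scale factor is the per-rank thread count of the run rather than the loop's own `Nb Threads` value.
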