Run | Configuration |
---|---|
Run 2x1 | Number processes: 1; Number nodes: 1; Run Command: <executable> -x 100 -y 100 -z 100 --xproc=2 --yproc=1 --zproc=1; MPI Command: mpirun -np 2; Dataset: (empty); Run Directory: /beegfs/hackathon/users/eoseret/qaas_runs/170-850-7424/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_run_1708509234; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread; OMP_NUM_THREADS: 1 |
Run 2x2 | OMP_NUM_THREADS: 2; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 2x4 | OMP_NUM_THREADS: 4; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 2x8 | OMP_NUM_THREADS: 8; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 2x16 | OMP_NUM_THREADS: 16; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 2x32 | OMP_NUM_THREADS: 32; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 2x64 | OMP_NUM_THREADS: 64; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
Run 2x96 | OMP_NUM_THREADS: 96; I_MPI_PIN_DOMAIN: auto:scatter; OMP_PLACES: threads; OMP_PROC_BIND: spread |
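The run labels appear to follow a `<MPI ranks>x<OpenMP threads per rank>` pattern: `mpirun -np 2` starts 2 ranks, and `OMP_NUM_THREADS` goes from 1 to 96. The report does not state this convention, so it is an assumption. Under that reading, the total thread count of a run is the product of the two numbers, which agrees with the `Nb Threads` values reported for most loops (2, 4, 8, ..., 192). A minimal sketch, with `total_threads` as a hypothetical helper name:

```python
# Hypothetical helper: read a run label "RxT" as R MPI ranks x T OpenMP threads.
# The "RxT" interpretation is an assumption inferred from the run settings.
def total_threads(label: str) -> int:
    ranks, omp_threads = (int(part) for part in label.split("x"))
    return ranks * omp_threads


# Run labels as they appear in the configuration table.
labels = ["2x1", "2x2", "2x4", "2x8", "2x16", "2x32", "2x64", "2x96"]
totals = {label: total_threads(label) for label in labels}

for label, count in totals.items():
    print(f"Run {label}: {count} threads in total")
```

Loops that report fewer threads than this product (for example a constant 2 across all runs) were presumably executed by only part of the thread team.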
Loop id | Source Location | Source Function | Level | Coverage 2x1 (%) | Coverage 2x2 (%) | Coverage 2x4 (%) | Coverage 2x8 (%) | Coverage 2x16 (%) | Coverage 2x32 (%) | Coverage 2x64 (%) | Coverage 2x96 (%) | Max Time Over Threads 2x1 (s) | Max Time Over Threads 2x2 (s) | Max Time Over Threads 2x4 (s) | Max Time Over Threads 2x8 (s) | Max Time Over Threads 2x16 (s) | Max Time Over Threads 2x32 (s) | Max Time Over Threads 2x64 (s) | Max Time Over Threads 2x96 (s) | Time w.r.t. Wall Time 2x1 (s) | Time w.r.t. Wall Time 2x2 (s) | Time w.r.t. Wall Time 2x4 (s) | Time w.r.t. Wall Time 2x8 (s) | Time w.r.t. Wall Time 2x16 (s) | Time w.r.t. Wall Time 2x32 (s) | Time w.r.t. Wall Time 2x64 (s) | Time w.r.t. Wall Time 2x96 (s) | Nb Threads 2x1 | Nb Threads 2x2 | Nb Threads 2x4 | Nb Threads 2x8 | Nb Threads 2x16 | Nb Threads 2x32 | Nb Threads 2x64 | Nb Threads 2x96 | GFLOPS 2x1 | GFLOPS 2x2 | GFLOPS 2x4 | GFLOPS 2x8 | GFLOPS 2x16 | GFLOPS 2x32 | GFLOPS 2x64 | GFLOPS 2x96 | Vectorization Ratio (%) | Vector Length Use (%) | Speedup If No Scalar Integer | Speedup If FP Vectorized | Speedup If Fully Vectorized | Speedup If Perfect Load Balancing 2x1 | Speedup If Perfect Load Balancing 2x2 | Speedup If Perfect Load Balancing 2x4 | Speedup If Perfect Load Balancing 2x8 | Speedup If Perfect Load Balancing 2x16 | Speedup If Perfect Load Balancing 2x32 | Speedup If Perfect Load Balancing 2x64 | Speedup If Perfect Load Balancing 2x96 | Stride 0 | Stride 1 | Stride n | Stride Unknown | Stride Indirect | (2x1) Efficiency | (2x1) Potential Speed-Up (%) | (2x2) Efficiency | (2x2) Potential Speed-Up (%) | (2x4) Efficiency | (2x4) Potential Speed-Up (%) | (2x8) Efficiency | (2x8) Potential Speed-Up (%) | (2x16) Efficiency | (2x16) Potential Speed-Up (%) | (2x32) Efficiency | (2x32) Potential Speed-Up (%) | (2x64) Efficiency | (2x64) Potential Speed-Up (%) | (2x96) Efficiency | (2x96) Potential Speed-Up (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
145 | exec - ljForce.c:191-216 [...] | ljForce.extracted | Innermost | 93.14 | 90.16 | 85.25 | 78.92 | 68.12 | 49.41 | 35.94 | 28.74 | 270.53 | 135.18 | 68.29 | 34.03 | 17.33 | 9.72 | 5.23 | 3.86 | 270.44 | 135.34 | 68.01 | 34.12 | 17.2 | 8.68 | 4.36 | 3.2 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 192 | 13.50 | 26.96 | 53.66 | 106.95 | 212.20 | 420.54 | 837.16 | 1140.34 | 44.07 | 18.01 | 1.07 | 2.31 | 4.6 | 1 | 1 | 1.01 | 1 | 1.02 | 1.14 | 1.24 | 1.26 | 1.67 | 0 | 1 | 0 | 0.67 | 1 | 0 | 1 | 0.08 | 0.99 | 0.5 | 0.99 | 0.73 | 0.98 | 1.18 | 0.97 | 1.3 | 0.97 | 1.11 | 0.88 | 3.44 |
152 | exec - timestep.c:74-78 | advanceVelocity.extracted | Innermost | 1.26 | 1.17 | 1.1 | 0.82 | 0.72 | 0.8 | 0.76 | 0.51 | 3.67 | 1.76 | 1.16 | 0.38 | 0.23 | 0.19 | 0.18 | 0.08 | 3.66 | 1.75 | 0.88 | 0.35 | 0.18 | 0.14 | 0.09 | 0.06 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 192 | 2.61 | 5.45 | 10.84 | 27.11 | 53.27 | 67.36 | 107.07 | 161.59 | 0 | 12.5 | 1 | 1 | 4.8 | 1 | 1.01 | 1.33 | 1.09 | 1.28 | 1.36 | 2 | 1.6 | 0 | 2 | 0 | 0 | 0 | 1 | 0 | 1.05 | 0 | 1.04 | 0 | 1.31 | 0 | 1.27 | 0 | 0.82 | 0.15 | 0.64 | 0.28 | 0.64 | 0.19 |
144 | exec - ljForce.c:187-216 [...] | ljForce.extracted | InBetween | 0.93 | 0.88 | 0.86 | 0.76 | 0.66 | 0.46 | 0.37 | 0.29 | 2.74 | 1.39 | 0.77 | 0.4 | 0.22 | 0.13 | 0.09 | 0.07 | 2.71 | 1.32 | 0.69 | 0.33 | 0.17 | 0.08 | 0.05 | 0.03 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 192 | 15.00 | 31.74 | 60.27 | 128.02 | 244.98 | 518.87 | 821.71 | 1386.22 | 0 | 12.5 | 1 | 1 | 8 | 1.01 | 1.05 | 1.13 | 1.21 | 1.29 | 1.63 | 2.25 | 2.33 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 1.03 | 0 | 0.98 | 0.02 | 1.03 | 0 | 1 | 0 | 1.06 | 0 | 0.85 | 0.06 | 0.94 | 0.02 |
84 | exec - haloExchange.c:621-630 | sortAtomsInCell.A | Single | 0.83 | 1.05 | 1.24 | 1.46 | 1.36 | 1.08 | 1.31 | 1.07 | 2.48 | 2.21 | 1.26 | 0.93 | 0.56 | 0.36 | 0.27 | 0.22 | 2.41 | 1.58 | 0.99 | 0.63 | 0.34 | 0.19 | 0.16 | 0.12 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 192 | 1.39 | 2.20 | 3.51 | 5.46 | 10.29 | 18.54 | 22.77 | 30.35 | 33.33 | 14.58 | 1.5 | 1 | 4.57 | 1.03 | 1.41 | 1.27 | 1.48 | 1.65 | 1.89 | 1.8 | 2 | 0 | 2 | 3 | 0 | 0 | 1 | 0 | 0.76 | 0.25 | 0.61 | 0.49 | 0.48 | 0.76 | 0.44 | 0.76 | 0.4 | 0.65 | 0.24 | 1 | 0.21 | 0.85 |
131 | exec - linkCells.c:295-378 [...] | updateLinkCells.A | Innermost | 0.83 | 0.99 | 1.14 | 0.93 | 0.8 | 0.66 | 0.43 | 0.34 | 2.41 | 3.55 | 4.21 | 3.57 | 3.56 | 4.43 | 3.6 | 4.09 | 2.41 | 1.49 | 0.91 | 0.4 | 0.2 | 0.12 | 0.05 | 0.04 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 5.07 | 8.59 | 13.74 | 32.45 | 61.77 | 108.97 | 268.23 | 314.57 | 46.81 | 17.15 | 1.7 | 1.84 | 5.11 | 1 | 1.2 | 1.16 | 1.11 | 1.11 | 1.23 | 1.13 | 1.17 | NA | NA | NA | NA | NA | 1 | 0 | 0.81 | 0.19 | 0.66 | 0.39 | 0.75 | 0.23 | 0.75 | 0.2 | 0.63 | 0.25 | 0.75 | 0.11 | 0.63 | 0.13 |
154 | exec - timestep.c:88-94 | advancePosition.extracted | Innermost | 0.66 | 0.63 | 0.66 | 0.52 | 0.45 | 0.46 | 0.43 | 0.32 | 1.93 | 0.96 | 0.6 | 0.26 | 0.15 | 0.12 | 0.14 | 0.08 | 1.92 | 0.95 | 0.52 | 0.22 | 0.11 | 0.08 | 0.05 | 0.04 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 192 | 2.69 | 5.41 | 9.90 | 23.09 | 45.96 | 64.45 | 100.98 | 126.28 | 7.69 | 13.46 | 1 | 1.46 | 4 | 1.01 | 1.01 | 1.15 | 1.18 | 1.36 | 1.5 | 2.8 | 2.67 | 0 | 3 | 0 | 0 | 1 | 1 | 0 | 1.01 | 0 | 0.92 | 0.05 | 1.09 | 0 | 1.09 | 0 | 0.75 | 0.12 | 0.6 | 0.17 | 0.5 | 0.16 |
143 | exec - ljForce.c:178-216 [...] | ljForce.extracted | InBetween | 0.33 | 0.31 | 0.31 | 0.28 | 0.24 | 0.17 | 0.13 | 0.1 | 1.03 | 0.51 | 0.28 | 0.15 | 0.1 | 0.06 | 0.07 | 0.04 | 0.95 | 0.46 | 0.25 | 0.12 | 0.06 | 0.03 | 0.02 | 0.01 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 192 | 18.85 | 38.57 | 70.81 | 143.03 | 282.48 | 564.92 | 874.89 | 1748.27 | 0 | 11.25 | 1 | 1 | 8 | 1.08 | 1.11 | 1.17 | 1.25 | 1.67 | 2 | 3.5 | 4 | NA | NA | NA | NA | NA | 1 | 0 | 1.03 | 0 | 0.95 | 0.02 | 0.99 | 0 | 0.99 | 0 | 0.99 | 0 | 0.74 | 0.03 | 0.99 | 0 |
147 | exec - ljForce.c:157-158 [...] | ljForce.extracted.27 | Single | 0.27 | 0.36 | 0.54 | 0.77 | 0.91 | 0.72 | 0.88 | 0.51 | 0.79 | 0.68 | 0.61 | 0.74 | 0.47 | 0.25 | 0.22 | 0.15 | 0.78 | 0.53 | 0.43 | 0.33 | 0.23 | 0.13 | 0.11 | 0.06 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 192 | 2.25 | 3.25 | 4.12 | 5.28 | 7.73 | 13.75 | 15.59 | 29.40 | 50 | 15.63 | 1.33 | 1 | 5.33 | 1.03 | 1.28 | 1.42 | 2.24 | 2.04 | 2.08 | 2.2 | 3 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0.74 | 0.1 | 0.45 | 0.3 | 0.3 | 0.54 | 0.21 | 0.72 | 0.19 | 0.58 | 0.11 | 0.78 | 0.14 | 0.44 |
85 | exec - haloExchange.c:633-642 | sortAtomsInCell.A | Single | 0.14 | 0.25 | 0.45 | 0.75 | 0.7 | 0.58 | 0.64 | 0.52 | 0.43 | 0.71 | 0.48 | 0.6 | 0.35 | 0.27 | 0.21 | 0.17 | 0.39 | 0.38 | 0.36 | 0.32 | 0.18 | 0.1 | 0.08 | 0.06 | 2 | 4 | 8 | 16 | 32 | 64 | 127 | 190 | 1.31 | 1.23 | 1.44 | 1.79 | 3.47 | 6.00 | 6.11 | 8.33 | 33.33 | 14.58 | 1.5 | 1 | 4.57 | 1.1 | 1.92 | 1.33 | 1.88 | 1.94 | 2.7 | 2.63 | 2.83 | 0 | 2 | 3 | 0 | 0 | 1 | 0 | 0.51 | 0.12 | 0.27 | 0.33 | 0.15 | 0.64 | 0.14 | 0.61 | 0.12 | 0.51 | 0.08 | 0.59 | 0.07 | 0.48 |
69 | exec - haloExchange.c:380-389 | loadAtomsBuffer.A | Innermost | 0.09 | 0.11 | 0.11 | 0.11 | 0.09 | 0.06 | 0.04 | 0.03 | 0.3 | 0.36 | 0.4 | 0.38 | 0.38 | 0.4 | 0.32 | 0.34 | 0.27 | 0.17 | 0.09 | 0.05 | 0.02 | 0.01 | 0 | 0 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 1.81 | 2.68 | 5.22 | 10.15 | 23.25 | 45.00 | 0.00 | 0.00 | 38.46 | 15.87 | 1.44 | 1.24 | 4.52 | 1.11 | 1.06 | 1.11 | 1 | 1.09 | 1.14 | 1.07 | 1.17 | 1 | 2 | 3 | 0 | 0 | 1 | 0 | 0.79 | 0.02 | 0.75 | 0.03 | 0.68 | 0.04 | 0.84 | 0.01 | 0.84 | 0.01 | 1 | 0 | 1 | 0 |
159 | exec - timestep.c:110-116 | kineticEnergy.extracted | Innermost | 0.06 | 0.06 | 0.05 | 0.04 | 0.03 | 0.02 | 0.05 | 0.01 | 0.19 | 0.1 | 0.06 | 0.03 | 0.01 | 0.01 | 0.01 | 0 | 0.19 | 0.09 | 0.04 | 0.02 | 0.01 | 0 | 0.01 | 0 | 2 | 4 | 8 | 16 | 32 | 64 | 127 | 177 | 2.91 | 6.26 | 14.44 | 29.13 | 57.50 | 0.00 | 65.50 | 0.00 | 80.77 | 22.12 | 1 | 1.48 | 2 | 1 | 1.11 | 1.5 | 1.5 | 1 | 0 | 1 | 0 | 0 | 2 | 0 | 0 | 4 | 1 | 0 | 1.06 | -0 | 1.19 | 0 | 1.19 | 0 | 1.19 | 0 | 1 | 0 | 0.3 | 0.04 | 1 | 0 |
122 | exec - initAtoms.c:197-202 | randomDisplacements.extracted | Innermost | 0.01 | 0.01 | 0.01 | 0.01 | 0.01 | 0.01 | 0.02 | 0.03 | 0.03 | 0.03 | 0.02 | 0.01 | 0 | 0 | 0.01 | 0 | 0.03 | 0.02 | 0.01 | 0 | 0 | 0 | 0 | 0 | 2 | 4 | 8 | 13 | 18 | 30 | 60 | 111 | 0.58 | 0.75 | 1.63 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0 | 11.96 | 1 | 2.86 | 11.12 | 1 | 1.5 | 2 | 0 | 0 | 0 | 0 | 0 | 3 | 2 | 0 | 0 | 3 | 1 | 0 | 0.75 | 0 | 0.75 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |
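The `Efficiency` columns can be cross-checked against the `Time w.r.t. Wall Time` columns. The sketch below assumes that efficiency is the speedup over the 2x1 run divided by the growth in thread count; the report does not give the formula, so this is a guess. For loop 145 (`ljForce.c:191-216`) the rounded results agree with the reported values (1, 1, 0.99, 0.99, 0.98, 0.97, 0.97, 0.88). The function name `scaling` is hypothetical.

```python
# Wall-time values for loop 145, copied from the "Time w.r.t. Wall Time" columns.
omp_threads = [1, 2, 4, 8, 16, 32, 64, 96]          # OpenMP threads per rank
wall_time_s = [270.44, 135.34, 68.01, 34.12, 17.2, 8.68, 4.36, 3.2]


def scaling(times, threads):
    """Return (threads, speedup, efficiency) relative to the first run.

    Assumed formula: efficiency = (t_base / t) / (n / n_base).
    """
    base_time, base_threads = times[0], threads[0]
    rows = []
    for t, n in zip(times, threads):
        speedup = base_time / t
        efficiency = speedup / (n / base_threads)
        rows.append((n, speedup, efficiency))
    return rows


rows = scaling(wall_time_s, omp_threads)
for n, speedup, efficiency in rows:
    print(f"2x{n}: speedup {speedup:.2f}, efficiency {efficiency:.2f}")
```

Efficiency stays near 1 up to 2x64 and drops to about 0.88 at 2x96, while the loop's share of total run time (coverage) falls from 93.14% to 28.74% over the same range.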