* Info: Detected 2 Lprof instances in gmz11.benchmarkcenter.megware.com: processes-per-node/ppn set accordingly.
If this is incorrect, rerun with an explicit value for this setting
* Info: Selecting the 'perf-high-ppn' engine for node gmz11.benchmarkcenter.megware.com
* Info: "ref-cycles" not supported on gmz11.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host gmz11.benchmarkcenter.megware.com, process 11916)
* Info: Process launched (host gmz11.benchmarkcenter.megware.com, process 11920)
Fri Feb 23 13:28:22 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: gmz16.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -DDO_MPI -O2 -axCORE-AVX512 -mprefer-vector-width=512 -g -grecord-gcc-switches -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=icx -fiopenmp'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (1 threads)
Double Precision: true
Run Date/Time: 2024-02-23, 13:28:22
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303588, atom count : 4000000
Fri Feb 23 13:28:24 2024: Initialization Finished
Fri Feb 23 13:28:24 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303588 -1.243619295188 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650499 -1.233157709948 0.067098059449 519.0938 1.0301 4000000
20 20.00 -1.166048438417 -1.208183014318 0.042134575902 325.9677 1.3583 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 1.4711 4000000
40 40.00 -1.166042093135 -1.183625399859 0.017583306724 136.0305 1.5058 4000000
50 50.00 -1.166051684893 -1.193713710258 0.027662025365 214.0030 1.5160 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 1.5152 4000000
70 70.00 -1.166052143011 -1.204911990844 0.038859847833 300.6332 1.5106 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 1.5033 4000000
90 90.00 -1.166048006780 -1.203820491598 0.037772484818 292.2210 1.4970 4000000
100 100.00 -1.166049793505 -1.206862845061 0.040813051556 315.7439 1.4884 4000000
Fri Feb 23 13:33:12 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303588
Final energy : -1.166049793505
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 290.2312 290.2312 100.80
loop 1 287.9176 287.9176 100.00
timestep 10 28.7917 287.9172 100.00
position 100 0.0194 1.9359 0.67
velocity 200 0.0183 3.6606 1.27
redistribute 101 0.0902 9.1127 3.17
atomHalo 101 0.0103 1.0353 0.36
force 101 2.7210 274.8254 95.45
commHalo 303 0.0005 0.1448 0.05
commReduce 39 0.0006 0.0230 0.01
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 0: 290.2312 1: 290.2314 290.2313 0.0001
loop 0: 287.9176 1: 287.9176 287.9176 0.0000
timestep 0: 287.9172 1: 287.9174 287.9173 0.0001
position 1: 1.9306 0: 1.9359 1.9332 0.0027
velocity 1: 3.6411 0: 3.6606 3.6508 0.0097
redistribute 0: 9.1127 1: 9.2217 9.1672 0.0545
atomHalo 0: 1.0353 1: 1.1946 1.1149 0.0796
force 1: 274.7507 0: 274.8254 274.7880 0.0373
commHalo 0: 0.1448 1: 0.2690 0.2069 0.0621
commReduce 1: 0.0118 0: 0.0230 0.0174 0.0056
---------------------------------------------------
Average atom update rate: 1.44 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.72 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 1.39 atoms/us
---------------------------------------------------
Fri Feb 23 13:33:12 2024: CoMD Ending
* Info: Process finished (host gmz11.benchmarkcenter.megware.com, process 11920)
* Warning: (host gmz11.benchmarkcenter.megware.com, process 11920) Observed more threads (2) than expected (1): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=2.
* Info: Process finished (host gmz11.benchmarkcenter.megware.com, process 11916)
* Warning: (host gmz11.benchmarkcenter.megware.com, process 11916) Observed more threads (2) than expected (1): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=2.
Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_0
To display your profiling results:
##################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_0 #
# Functions | Per-node | maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_0 #
# Functions | Per-process | maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_0 #
# Functions | Per-thread | maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_0 #
# Loops | Cluster-wide | maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_0 #
# Loops | Per-node | maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_0 #
# Loops | Per-process | maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_0 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_0 #
##################################################################################################################################################################################################
* Info: Detected 2 Lprof instances in gmz11.benchmarkcenter.megware.com: processes-per-node/ppn set accordingly.
If this is incorrect, rerun with an explicit value for this setting
* Info: Selecting the 'perf-high-ppn' engine for node gmz11.benchmarkcenter.megware.com
* Info: "ref-cycles" not supported on gmz11.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host gmz11.benchmarkcenter.megware.com, process 12010)
* Info: Process launched (host gmz11.benchmarkcenter.megware.com, process 12009)
Fri Feb 23 13:33:24 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: gmz16.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -DDO_MPI -O2 -axCORE-AVX512 -mprefer-vector-width=512 -g -grecord-gcc-switches -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=icx -fiopenmp'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (2 threads)
Double Precision: true
Run Date/Time: 2024-02-23, 13:33:24
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063304092, atom count : 4000000
Fri Feb 23 13:33:25 2024: Initialization Finished
Fri Feb 23 13:33:25 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063304092 -1.243619295692 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650500 -1.233157709949 0.067098059449 519.0938 0.5337 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.7034 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.7595 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.7774 4000000
50 50.00 -1.166051684893 -1.193713710258 0.027662025365 214.0030 0.7835 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.7823 4000000
70 70.00 -1.166052143011 -1.204911990844 0.038859847833 300.6332 0.7801 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.7767 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.7729 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.7701 4000000
Fri Feb 23 13:35:54 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063304092
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 150.1208 150.1208 100.89
loop 1 148.7936 148.7936 100.00
timestep 10 14.8793 148.7931 100.00
position 100 0.0093 0.9306 0.63
velocity 200 0.0089 1.7861 1.20
redistribute 101 0.0917 9.2586 6.22
atomHalo 101 0.0340 3.4350 2.31
force 101 1.3616 137.5257 92.43
commHalo 303 0.0083 2.5238 1.70
commReduce 39 0.0050 0.1963 0.13
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 0: 150.1208 1: 150.1210 150.1209 0.0001
loop 0: 148.7936 1: 148.7936 148.7936 0.0000
timestep 0: 148.7931 1: 148.7934 148.7933 0.0001
position 0: 0.9306 1: 0.9579 0.9443 0.0136
velocity 1: 1.7702 0: 1.7861 1.7781 0.0079
redistribute 0: 9.2586 1: 9.4107 9.3347 0.0760
atomHalo 1: 1.2205 0: 3.4350 2.3278 1.1073
force 0: 137.5257 1: 137.5312 137.5285 0.0027
commHalo 1: 0.1074 0: 2.5238 1.3156 1.2082
commReduce 1: 0.0006 0: 0.1963 0.0985 0.0978
---------------------------------------------------
Average atom update rate: 0.74 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.37 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 2.69 atoms/us
---------------------------------------------------
Fri Feb 23 13:35:54 2024: CoMD Ending
* Info: Process finished (host gmz11.benchmarkcenter.megware.com, process 12010)
* Warning: (host gmz11.benchmarkcenter.megware.com, process 12010) Observed more threads (3) than expected (2): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=3.
* Info: Process finished (host gmz11.benchmarkcenter.megware.com, process 12009)
* Warning: (host gmz11.benchmarkcenter.megware.com, process 12009) Observed more threads (3) than expected (2): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=3.
Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_1
To display your profiling results:
##################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_1 #
# Functions | Per-node | maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_1 #
# Functions | Per-process | maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_1 #
# Functions | Per-thread | maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_1 #
# Loops | Cluster-wide | maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_1 #
# Loops | Per-node | maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_1 #
# Loops | Per-process | maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_1 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_1 #
##################################################################################################################################################################################################
* Info: Detected 2 Lprof instances in gmz11.benchmarkcenter.megware.com: processes-per-node/ppn set accordingly.
If this is incorrect, rerun with an explicit value for this setting
* Info: Selecting the 'perf-high-ppn' engine for node gmz11.benchmarkcenter.megware.com
* Info: "ref-cycles" not supported on gmz11.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host gmz11.benchmarkcenter.megware.com, process 12095)
* Info: Process launched (host gmz11.benchmarkcenter.megware.com, process 12096)
Fri Feb 23 13:36:04 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: gmz16.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -DDO_MPI -O2 -axCORE-AVX512 -mprefer-vector-width=512 -g -grecord-gcc-switches -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=icx -fiopenmp'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (4 threads)
Double Precision: true
Run Date/Time: 2024-02-23, 13:36:04
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303067, atom count : 4000000
Fri Feb 23 13:36:05 2024: Initialization Finished
Fri Feb 23 13:36:05 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303067 -1.243619294667 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650499 -1.233157709949 0.067098059449 519.0938 0.2922 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.3730 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.4023 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.4128 4000000
50 50.00 -1.166051684893 -1.193713710257 0.027662025365 214.0030 0.4161 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.4158 4000000
70 70.00 -1.166052143011 -1.204911990845 0.038859847833 300.6332 0.4142 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.4128 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.4111 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.4088 4000000
Fri Feb 23 13:37:24 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303067
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 79.9769 79.9769 101.00
loop 1 79.1814 79.1814 100.00
timestep 10 7.9181 79.1808 100.00
position 100 0.0054 0.5364 0.68
velocity 200 0.0059 1.1738 1.48
redistribute 101 0.0839 8.4770 10.71
atomHalo 101 0.0264 2.6687 3.37
force 101 0.6876 69.4460 87.70
commHalo 303 0.0054 1.6325 2.06
commReduce 39 0.0007 0.0270 0.03
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 0: 79.9769 1: 79.9770 79.9770 0.0000
loop 0: 79.1814 1: 79.1814 79.1814 0.0000
timestep 0: 79.1808 1: 79.1811 79.1810 0.0001
position 0: 0.5364 1: 0.6291 0.5828 0.0463
velocity 1: 0.9414 0: 1.1738 1.0576 0.1162
redistribute 1: 8.4041 0: 8.4770 8.4405 0.0364
atomHalo 1: 1.1873 0: 2.6687 1.9280 0.7407
force 0: 69.4460 1: 69.6783 69.5621 0.1161
commHalo 1: 0.0963 0: 1.6325 0.8644 0.7681
commReduce 1: 0.0049 0: 0.0270 0.0160 0.0111
---------------------------------------------------
Average atom update rate: 0.40 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.20 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 5.05 atoms/us
---------------------------------------------------
Fri Feb 23 13:37:24 2024: CoMD Ending
* Info: Process finished (host gmz11.benchmarkcenter.megware.com, process 12096)
* Warning: (host gmz11.benchmarkcenter.megware.com, process 12096) Observed more threads (5) than expected (4): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=5.
* Info: Process finished (host gmz11.benchmarkcenter.megware.com, process 12095)
* Warning: (host gmz11.benchmarkcenter.megware.com, process 12095) Observed more threads (5) than expected (4): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=5.
Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_2
To display your profiling results:
##################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_2 #
# Functions | Per-node | maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_2 #
# Functions | Per-process | maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_2 #
# Functions | Per-thread | maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_2 #
# Loops | Cluster-wide | maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_2 #
# Loops | Per-node | maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_2 #
# Loops | Per-process | maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_2 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_2 #
##################################################################################################################################################################################################
* Info: Detected 2 Lprof instances in gmz11.benchmarkcenter.megware.com: processes-per-node/ppn set accordingly.
If this is incorrect, rerun with an explicit value for this setting
* Info: Selecting the 'perf-high-ppn' engine for node gmz11.benchmarkcenter.megware.com
* Info: "ref-cycles" not supported on gmz11.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host gmz11.benchmarkcenter.megware.com, process 12188)
* Info: Process launched (host gmz11.benchmarkcenter.megware.com, process 12187)
Fri Feb 23 13:37:36 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: gmz16.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -DDO_MPI -O2 -axCORE-AVX512 -mprefer-vector-width=512 -g -grecord-gcc-switches -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=icx -fiopenmp'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (8 threads)
Double Precision: true
Run Date/Time: 2024-02-23, 13:37:36
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303663, atom count : 4000000
Fri Feb 23 13:37:36 2024: Initialization Finished
Fri Feb 23 13:37:36 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303663 -1.243619295263 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650500 -1.233157709949 0.067098059449 519.0938 0.1702 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.2057 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.2176 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.2220 4000000
50 50.00 -1.166051684893 -1.193713710258 0.027662025365 214.0030 0.2233 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.2229 4000000
70 70.00 -1.166052143011 -1.204911990845 0.038859847833 300.6332 0.2226 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.2216 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.2205 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.2193 4000000
Fri Feb 23 13:38:19 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303663
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 43.4787 43.4787 101.31
loop 1 42.9163 42.9163 100.00
timestep 10 4.2916 42.9158 100.00
position 100 0.0025 0.2472 0.58
velocity 200 0.0019 0.3886 0.91
redistribute 101 0.0703 7.1013 16.55
atomHalo 101 0.0179 1.8114 4.22
force 101 0.3512 35.4752 82.66
commHalo 303 0.0022 0.6518 1.52
commReduce 39 0.0008 0.0305 0.07
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 0: 43.4787 1: 43.4788 43.4787 0.0001
loop 0: 42.9163 1: 42.9163 42.9163 0.0000
timestep 0: 42.9158 1: 42.9160 42.9159 0.0001
position 1: 0.2348 0: 0.2472 0.2410 0.0062
velocity 0: 0.3886 1: 0.3892 0.3889 0.0003
redistribute 0: 7.1013 1: 7.2633 7.1823 0.0810
atomHalo 1: 1.3852 0: 1.8114 1.5983 0.2131
force 1: 35.2843 0: 35.4752 35.3797 0.0954
commHalo 1: 0.1518 0: 0.6518 0.4018 0.2500
commReduce 0: 0.0305 1: 0.0455 0.0380 0.0075
---------------------------------------------------
Average atom update rate: 0.21 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.11 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 9.32 atoms/us
---------------------------------------------------
Fri Feb 23 13:38:19 2024: CoMD Ending
* Info: Process finished (host gmz11.benchmarkcenter.megware.com, process 12188)
* Warning: (host gmz11.benchmarkcenter.megware.com, process 12188) Observed more threads (9) than expected (8): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=9.
* Info: Process finished (host gmz11.benchmarkcenter.megware.com, process 12187)
* Warning: (host gmz11.benchmarkcenter.megware.com, process 12187) Observed more threads (9) than expected (8): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=9.
Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_3
To display your profiling results:
##################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_3 #
# Functions | Per-node | maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_3 #
# Functions | Per-process | maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_3 #
# Functions | Per-thread | maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_3 #
# Loops | Cluster-wide | maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_3 #
# Loops | Per-node | maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_3 #
# Loops | Per-process | maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_3 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_3 #
##################################################################################################################################################################################################
* Info: Detected 2 Lprof instances in gmz11.benchmarkcenter.megware.com: processes-per-node/ppn set accordingly.
If this is incorrect, rerun with an explicit value for this setting
* Info: Selecting the 'perf-high-ppn' engine for node gmz11.benchmarkcenter.megware.com
* Info: "ref-cycles" not supported on gmz11.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host gmz11.benchmarkcenter.megware.com, process 12270)
* Info: Process launched (host gmz11.benchmarkcenter.megware.com, process 12271)
Fri Feb 23 13:38:31 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: gmz16.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -DDO_MPI -O2 -axCORE-AVX512 -mprefer-vector-width=512 -g -grecord-gcc-switches -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=icx -fiopenmp'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (16 threads)
Double Precision: true
Run Date/Time: 2024-02-23, 13:38:31
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303487, atom count : 4000000
Fri Feb 23 13:38:31 2024: Initialization Finished
Fri Feb 23 13:38:31 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303487 -1.243619295087 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650500 -1.233157709949 0.067098059449 519.0938 0.1051 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.1216 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.1270 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.1285 4000000
50 50.00 -1.166051684893 -1.193713710257 0.027662025365 214.0030 0.1295 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.1294 4000000
70 70.00 -1.166052143011 -1.204911990844 0.038859847833 300.6332 0.1291 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.1283 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.1283 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.1273 4000000
Fri Feb 23 13:38:56 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303487
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 25.4964 25.4964 101.66
loop 1 25.0809 25.0809 100.00
timestep 10 2.5080 25.0804 100.00
position 100 0.0014 0.1417 0.56
velocity 200 0.0011 0.2264 0.90
redistribute 101 0.0660 6.6648 26.57
atomHalo 101 0.0218 2.1968 8.76
force 101 0.1805 18.2329 72.70
commHalo 303 0.0028 0.8397 3.35
commReduce 39 0.0001 0.0048 0.02
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 0: 25.4964 1: 25.4965 25.4964 0.0001
loop 0: 25.0809 1: 25.0809 25.0809 0.0000
timestep 0: 25.0804 1: 25.0806 25.0805 0.0001
position 0: 0.1417 1: 0.1452 0.1434 0.0018
velocity 0: 0.2264 1: 0.2394 0.2329 0.0065
redistribute 0: 6.6648 1: 6.7717 6.7182 0.0535
atomHalo 1: 1.5272 0: 2.1968 1.8620 0.3348
force 1: 18.0918 0: 18.2329 18.1624 0.0705
commHalo 1: 0.1127 0: 0.8397 0.4762 0.3635
commReduce 0: 0.0048 1: 0.0203 0.0125 0.0078
---------------------------------------------------
Average atom update rate: 0.13 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.06 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 15.95 atoms/us
---------------------------------------------------
Fri Feb 23 13:38:56 2024: CoMD Ending
* Info: Process finished (host gmz11.benchmarkcenter.megware.com, process 12271)
* Warning: (host gmz11.benchmarkcenter.megware.com, process 12271) Observed more threads (17) than expected (16): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=17.
* Info: Process finished (host gmz11.benchmarkcenter.megware.com, process 12270)
* Warning: (host gmz11.benchmarkcenter.megware.com, process 12270) Observed more threads (17) than expected (16): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=17.
Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_4
To display your profiling results:
##################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_4 #
# Functions | Per-node | maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_4 #
# Functions | Per-process | maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_4 #
# Functions | Per-thread | maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_4 #
# Loops | Cluster-wide | maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_4 #
# Loops | Per-node | maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_4 #
# Loops | Per-process | maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_4 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_4 #
##################################################################################################################################################################################################
* Info: Detected 2 Lprof instances in gmz11.benchmarkcenter.megware.com: processes-per-node/ppn set accordingly.
If this is incorrect, rerun with an explicit value for this setting
* Info: Selecting the 'perf-high-ppn' engine for node gmz11.benchmarkcenter.megware.com
* Info: "ref-cycles" not supported on gmz11.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host gmz11.benchmarkcenter.megware.com, process 12404)
* Info: Process launched (host gmz11.benchmarkcenter.megware.com, process 12405)
Fri Feb 23 13:39:08 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: gmz16.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -DDO_MPI -O2 -axCORE-AVX512 -mprefer-vector-width=512 -g -grecord-gcc-switches -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=icx -fiopenmp'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (32 threads)
Double Precision: true
Run Date/Time: 2024-02-23, 13:39:08
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303522, atom count : 4000000
Fri Feb 23 13:39:09 2024: Initialization Finished
Fri Feb 23 13:39:09 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303522 -1.243619295122 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650500 -1.233157709949 0.067098059449 519.0938 0.0802 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.0870 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.0899 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.0892 4000000
50 50.00 -1.166051684893 -1.193713710258 0.027662025365 214.0030 0.0890 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.0888 4000000
70 70.00 -1.166052143011 -1.204911990844 0.038859847833 300.6332 0.0889 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.0889 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.0883 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.0880 4000000
Fri Feb 23 13:39:26 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303522
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 17.9214 17.9214 102.02
loop 1 17.5657 17.5657 100.00
timestep 10 1.7565 17.5653 100.00
position 100 0.0010 0.0991 0.56
velocity 200 0.0008 0.1648 0.94
redistribute 101 0.0720 7.2703 41.39
atomHalo 101 0.0326 3.2973 18.77
force 101 0.1004 10.1373 57.71
commHalo 303 0.0065 1.9565 11.14
commReduce 39 0.0008 0.0304 0.17
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 0: 17.9214 1: 17.9216 17.9215 0.0001
loop 1: 17.5657 0: 17.5657 17.5657 0.0000
timestep 0: 17.5653 1: 17.5655 17.5654 0.0001
position 0: 0.0991 1: 0.1132 0.1062 0.0071
velocity 0: 0.1648 1: 0.1879 0.1764 0.0116
redistribute 0: 7.2703 1: 7.3658 7.3180 0.0477
atomHalo 1: 1.6138 0: 3.2973 2.4555 0.8418
force 1: 10.0301 0: 10.1373 10.0837 0.0536
commHalo 1: 0.1012 0: 1.9565 1.0288 0.9276
commReduce 1: 0.0014 0: 0.0304 0.0159 0.0145
---------------------------------------------------
Average atom update rate: 0.09 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.04 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 22.77 atoms/us
---------------------------------------------------
Fri Feb 23 13:39:26 2024: CoMD Ending
* Info: Process finished (host gmz11.benchmarkcenter.megware.com, process 12405)
* Warning: (host gmz11.benchmarkcenter.megware.com, process 12405) Observed more threads (33) than expected (32): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=33.
* Info: Process finished (host gmz11.benchmarkcenter.megware.com, process 12404)
* Warning: (host gmz11.benchmarkcenter.megware.com, process 12404) Observed more threads (33) than expected (32): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=33.
Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_5
To display your profiling results:
##################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_5 #
# Functions | Per-node | maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_5 #
# Functions | Per-process | maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_5 #
# Functions | Per-thread | maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_5 #
# Loops | Cluster-wide | maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_5 #
# Loops | Per-node | maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_5 #
# Loops | Per-process | maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_5 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_5 #
##################################################################################################################################################################################################
* Info: Detected 2 Lprof instances in gmz11.benchmarkcenter.megware.com: processes-per-node/ppn set accordingly.
If this is incorrect, rerun with an explicit value for this setting
* Info: Selecting the 'perf-high-ppn' engine for node gmz11.benchmarkcenter.megware.com
* Info: "ref-cycles" not supported on gmz11.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host gmz11.benchmarkcenter.megware.com, process 12585)
* Info: Process launched (host gmz11.benchmarkcenter.megware.com, process 12586)
Fri Feb 23 13:39:38 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: gmz16.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -DDO_MPI -O2 -axCORE-AVX512 -mprefer-vector-width=512 -g -grecord-gcc-switches -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=icx -fiopenmp'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (64 threads)
Double Precision: true
Run Date/Time: 2024-02-23, 13:39:38
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303459, atom count : 4000000
Fri Feb 23 13:39:39 2024: Initialization Finished
Fri Feb 23 13:39:39 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303459 -1.243619295059 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650500 -1.233157709949 0.067098059449 519.0938 0.0629 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.0618 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.0602 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.0603 4000000
50 50.00 -1.166051684893 -1.193713710258 0.027662025365 214.0030 0.0608 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.0600 4000000
70 70.00 -1.166052143011 -1.204911990844 0.038859847833 300.6332 0.0596 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.0594 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.0592 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.0595 4000000
Fri Feb 23 13:39:51 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303459
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 12.4301 12.4301 102.94
loop 1 12.0746 12.0746 100.00
timestep 10 1.2074 12.0740 100.00
position 100 0.0011 0.1111 0.92
velocity 200 0.0010 0.1936 1.60
redistribute 101 0.0621 6.2714 51.94
atomHalo 101 0.0223 2.2480 18.62
force 101 0.0554 5.5971 46.35
commHalo 303 0.0024 0.7146 5.92
commReduce 39 0.0003 0.0118 0.10
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 0: 12.4301 1: 12.4302 12.4302 0.0000
loop 0: 12.0746 1: 12.0747 12.0746 0.0000
timestep 0: 12.0740 1: 12.0743 12.0742 0.0001
position 1: 0.0770 0: 0.1111 0.0941 0.0170
velocity 1: 0.1275 0: 0.1936 0.1605 0.0331
redistribute 0: 6.2714 1: 6.3731 6.3223 0.0508
atomHalo 1: 1.7287 0: 2.2480 1.9883 0.2596
force 1: 5.5793 0: 5.5971 5.5882 0.0089
commHalo 1: 0.1461 0: 0.7146 0.4304 0.2843
commReduce 0: 0.0118 1: 0.0276 0.0197 0.0079
---------------------------------------------------
Average atom update rate: 0.06 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.03 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 33.13 atoms/us
---------------------------------------------------
Fri Feb 23 13:39:51 2024: CoMD Ending
* Info: Process finished (host gmz11.benchmarkcenter.megware.com, process 12586)
* Warning: (host gmz11.benchmarkcenter.megware.com, process 12586) Observed more threads (65) than expected (64): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=65.
* Info: Process finished (host gmz11.benchmarkcenter.megware.com, process 12585)
* Warning: (host gmz11.benchmarkcenter.megware.com, process 12585) Observed more threads (65) than expected (64): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=65.
Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_6
To display your profiling results:
##################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_6 #
# Functions | Per-node | maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_6 #
# Functions | Per-process | maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_6 #
# Functions | Per-thread | maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_6 #
# Loops | Cluster-wide | maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_6 #
# Loops | Per-node | maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_6 #
# Loops | Per-process | maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_6 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_6 #
##################################################################################################################################################################################################
* Info: Detected 2 Lprof instances in gmz11.benchmarkcenter.megware.com: processes-per-node/ppn set accordingly.
If this is incorrect, rerun with an explicit value for this setting
* Info: Selecting the 'perf-high-ppn' engine for node gmz11.benchmarkcenter.megware.com
* Info: "ref-cycles" not supported on gmz11.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host gmz11.benchmarkcenter.megware.com, process 12890)
* Info: Process launched (host gmz11.benchmarkcenter.megware.com, process 12889)
Fri Feb 23 13:40:03 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: gmz16.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -DDO_MPI -O2 -axCORE-AVX512 -mprefer-vector-width=512 -g -grecord-gcc-switches -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=icx -fiopenmp'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (96 threads)
Double Precision: true
Run Date/Time: 2024-02-23, 13:40:03
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303456, atom count : 4000000
Fri Feb 23 13:40:03 2024: Initialization Finished
Fri Feb 23 13:40:03 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303456 -1.243619295056 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650500 -1.233157709949 0.067098059449 519.0938 0.0580 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.0579 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.0546 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.0541 4000000
50 50.00 -1.166051684893 -1.193713710258 0.027662025365 214.0030 0.0542 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.0548 4000000
70 70.00 -1.166052143011 -1.204911990844 0.038859847833 300.6332 0.0546 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.0543 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.0544 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.0543 4000000
Fri Feb 23 13:40:14 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303456
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 11.3797 11.3797 103.23
loop 1 11.0241 11.0241 100.00
timestep 10 1.1023 11.0233 99.99
position 100 0.0007 0.0684 0.62
velocity 200 0.0006 0.1117 1.01
redistribute 101 0.0677 6.8349 62.00
atomHalo 101 0.0285 2.8794 26.12
force 101 0.0406 4.1001 37.19
commHalo 303 0.0044 1.3360 12.12
commReduce 39 0.0003 0.0129 0.12
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 1: 11.3797 0: 11.3797 11.3797 0.0000
loop 0: 11.0241 1: 11.0241 11.0241 0.0000
timestep 0: 11.0233 1: 11.0236 11.0235 0.0001
position 0: 0.0684 1: 0.0726 0.0705 0.0021
velocity 0: 0.1117 1: 0.1203 0.1160 0.0043
redistribute 0: 6.8349 1: 6.8920 6.8635 0.0285
atomHalo 1: 1.8706 0: 2.8794 2.3750 0.5044
force 1: 4.0356 0: 4.1001 4.0678 0.0323
commHalo 1: 0.0987 0: 1.3360 0.7174 0.6187
commReduce 1: 0.0053 0: 0.0129 0.0091 0.0038
---------------------------------------------------
Average atom update rate: 0.06 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.03 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 36.29 atoms/us
---------------------------------------------------
Fri Feb 23 13:40:14 2024: CoMD Ending
* Info: Process finished (host gmz11.benchmarkcenter.megware.com, process 12890)
* Warning: (host gmz11.benchmarkcenter.megware.com, process 12890) Observed more threads (97) than expected (96): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=97.
* Info: Process finished (host gmz11.benchmarkcenter.megware.com, process 12889)
* Warning: (host gmz11.benchmarkcenter.megware.com, process 12889) Observed more threads (97) than expected (96): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=97.
Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_7
To display your profiling results:
##################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_7 #
# Functions | Per-node | maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_7 #
# Functions | Per-process | maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_7 #
# Functions | Per-thread | maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_7 #
# Loops | Cluster-wide | maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_7 #
# Loops | Per-node | maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_7 #
# Loops | Per-process | maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_7 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/icx_6/oneview_results_scal/tools/lprof_npsu_run_7 #
##################################################################################################################################################################################################