* Info: Detected 2 Lprof instances in gmz17.benchmarkcenter.megware.com: processes-per-node/ppn set accordingly.
If this is incorrect, rerun with an explicit value for this setting
* Info: Selecting the 'perf-high-ppn' engine for node gmz17.benchmarkcenter.megware.com
* Info: "ref-cycles" not supported on gmz17.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host gmz17.benchmarkcenter.megware.com, process 154345)
* Info: Process launched (host gmz17.benchmarkcenter.megware.com, process 154349)
Thu Feb 22 15:09:27 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: gmz16.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -DDO_MPI -O3 -march=haswell -fno-tree-vectorize -fno-openmp-simd -funroll-loops -ffast-math -g -grecord-gcc-switches -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=gcc -fopenmp -funroll-loops'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (1 threads)
Double Precision: true
Run Date/Time: 2024-02-22, 15:09:27
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303588, atom count : 4000000
Thu Feb 22 15:09:30 2024: Initialization Finished
Thu Feb 22 15:09:30 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303588 -1.243619295188 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650499 -1.233157709948 0.067098059449 519.0938 0.9940 4000000
20 20.00 -1.166048438417 -1.208183014318 0.042134575902 325.9677 1.3385 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 1.4567 4000000
40 40.00 -1.166042093135 -1.183625399859 0.017583306724 136.0305 1.4930 4000000
50 50.00 -1.166051684893 -1.193713710258 0.027662025365 214.0030 1.5032 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 1.5019 4000000
70 70.00 -1.166052143011 -1.204911990844 0.038859847833 300.6332 1.4965 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 1.4891 4000000
90 90.00 -1.166048006780 -1.203820491598 0.037772484818 292.2210 1.4813 4000000
100 100.00 -1.166049793505 -1.206862845061 0.040813051556 315.7439 1.4728 4000000
Thu Feb 22 15:14:14 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303588
Final energy : -1.166049793505
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 286.7466 286.7466 100.78
loop 1 284.5411 284.5411 100.00
timestep 10 28.4541 284.5405 100.00
position 100 0.0186 1.8581 0.65
velocity 200 0.0170 3.3989 1.19
redistribute 101 0.0915 9.2421 3.25
atomHalo 101 0.0100 1.0101 0.35
force 101 2.6893 271.6169 95.46
commHalo 303 0.0007 0.2035 0.07
commReduce 39 0.0005 0.0192 0.01
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 0: 286.7466 1: 286.7467 286.7467 0.0001
loop 0: 284.5411 1: 284.5411 284.5411 0.0000
timestep 0: 284.5405 1: 284.5407 284.5406 0.0001
position 1: 1.8572 0: 1.8581 1.8576 0.0004
velocity 1: 3.3901 0: 3.3989 3.3945 0.0044
redistribute 0: 9.2421 1: 9.2655 9.2538 0.0117
atomHalo 0: 1.0101 1: 1.0597 1.0349 0.0248
force 1: 271.5985 0: 271.6169 271.6077 0.0092
commHalo 0: 0.2035 1: 0.2474 0.2254 0.0219
commReduce 0: 0.0192 1: 0.0198 0.0195 0.0003
---------------------------------------------------
Average atom update rate: 1.42 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.71 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 1.41 atoms/us
---------------------------------------------------
Thu Feb 22 15:14:14 2024: CoMD Ending
* Info: Process finished (host gmz17.benchmarkcenter.megware.com, process 154345)
* Warning: (host gmz17.benchmarkcenter.megware.com, process 154345) Observed more threads (2) than expected (1): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=2.
* Info: Process finished (host gmz17.benchmarkcenter.megware.com, process 154349)
* Warning: (host gmz17.benchmarkcenter.megware.com, process 154349) Observed more threads (2) than expected (1): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=2.
Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_0
To display your profiling results:
##################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_0 #
# Functions | Per-node | maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_0 #
# Functions | Per-process | maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_0 #
# Functions | Per-thread | maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_0 #
# Loops | Cluster-wide | maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_0 #
# Loops | Per-node | maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_0 #
# Loops | Per-process | maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_0 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_0 #
##################################################################################################################################################################################################
* Info: Detected 2 Lprof instances in gmz17.benchmarkcenter.megware.com: processes-per-node/ppn set accordingly.
If this is incorrect, rerun with an explicit value for this setting
* Info: Selecting the 'perf-high-ppn' engine for node gmz17.benchmarkcenter.megware.com
* Info: "ref-cycles" not supported on gmz17.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host gmz17.benchmarkcenter.megware.com, process 154439)
* Info: Process launched (host gmz17.benchmarkcenter.megware.com, process 154438)
Thu Feb 22 15:15:14 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: gmz16.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -DDO_MPI -O3 -march=haswell -fno-tree-vectorize -fno-openmp-simd -funroll-loops -ffast-math -g -grecord-gcc-switches -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=gcc -fopenmp -funroll-loops'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (2 threads)
Double Precision: true
Run Date/Time: 2024-02-22, 15:15:14
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063304092, atom count : 4000000
Thu Feb 22 15:15:15 2024: Initialization Finished
Thu Feb 22 15:15:15 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063304092 -1.243619295692 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650500 -1.233157709949 0.067098059449 519.0938 0.5222 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.6924 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.7532 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.7727 4000000
50 50.00 -1.166051684893 -1.193713710258 0.027662025365 214.0030 0.7784 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.7789 4000000
70 70.00 -1.166052143011 -1.204911990844 0.038859847833 300.6332 0.7758 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.7672 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.7652 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.7619 4000000
Thu Feb 22 15:17:42 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063304092
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 148.6388 148.6388 100.87
loop 1 147.3551 147.3551 100.00
timestep 10 14.7354 147.3545 100.00
position 100 0.0090 0.9039 0.61
velocity 200 0.0082 1.6388 1.11
redistribute 101 0.0949 9.5893 6.51
atomHalo 101 0.0356 3.5981 2.44
force 101 1.3457 135.9139 92.24
commHalo 303 0.0091 2.7699 1.88
commReduce 39 0.0054 0.2122 0.14
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 0: 148.6388 1: 148.6390 148.6389 0.0001
loop 0: 147.3551 1: 147.3551 147.3551 0.0000
timestep 0: 147.3545 1: 147.3547 147.3546 0.0001
position 0: 0.9039 1: 0.9354 0.9197 0.0157
velocity 0: 1.6388 1: 1.7705 1.7046 0.0658
redistribute 1: 9.1408 0: 9.5893 9.3650 0.2242
atomHalo 1: 1.1322 0: 3.5981 2.3652 1.2329
force 0: 135.9139 1: 136.3759 136.1449 0.2310
commHalo 1: 0.1395 0: 2.7699 1.4547 1.3152
commReduce 1: 0.0006 0: 0.2122 0.1064 0.1058
---------------------------------------------------
Average atom update rate: 0.74 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.37 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 2.71 atoms/us
---------------------------------------------------
Thu Feb 22 15:17:42 2024: CoMD Ending
* Info: Process finished (host gmz17.benchmarkcenter.megware.com, process 154438)
* Warning: (host gmz17.benchmarkcenter.megware.com, process 154438) Observed more threads (3) than expected (2): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=3.
* Info: Process finished (host gmz17.benchmarkcenter.megware.com, process 154439)
* Warning: (host gmz17.benchmarkcenter.megware.com, process 154439) Observed more threads (3) than expected (2): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=3.
Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_1
To display your profiling results:
##################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_1 #
# Functions | Per-node | maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_1 #
# Functions | Per-process | maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_1 #
# Functions | Per-thread | maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_1 #
# Loops | Cluster-wide | maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_1 #
# Loops | Per-node | maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_1 #
# Loops | Per-process | maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_1 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_1 #
##################################################################################################################################################################################################
* Info: Detected 2 Lprof instances in gmz17.benchmarkcenter.megware.com: processes-per-node/ppn set accordingly.
If this is incorrect, rerun with an explicit value for this setting
* Info: Selecting the 'perf-high-ppn' engine for node gmz17.benchmarkcenter.megware.com
* Info: "ref-cycles" not supported on gmz17.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host gmz17.benchmarkcenter.megware.com, process 154525)
* Info: Process launched (host gmz17.benchmarkcenter.megware.com, process 154526)
Thu Feb 22 15:17:54 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: gmz16.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -DDO_MPI -O3 -march=haswell -fno-tree-vectorize -fno-openmp-simd -funroll-loops -ffast-math -g -grecord-gcc-switches -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=gcc -fopenmp -funroll-loops'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (4 threads)
Double Precision: true
Run Date/Time: 2024-02-22, 15:17:54
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303067, atom count : 4000000
Thu Feb 22 15:17:54 2024: Initialization Finished
Thu Feb 22 15:17:54 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303067 -1.243619294667 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650499 -1.233157709949 0.067098059449 519.0938 0.2832 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.3607 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.3897 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.3990 4000000
50 50.00 -1.166051684893 -1.193713710257 0.027662025365 214.0030 0.4011 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.4013 4000000
70 70.00 -1.166052143011 -1.204911990845 0.038859847833 300.6332 0.4001 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.3983 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.4001 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.3978 4000000
Thu Feb 22 15:19:11 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303067
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 77.3916 77.3916 100.99
loop 1 76.6298 76.6298 100.00
timestep 10 7.6629 76.6292 100.00
position 100 0.0046 0.4611 0.60
velocity 200 0.0040 0.7976 1.04
redistribute 101 0.0745 7.5253 9.82
atomHalo 101 0.0208 2.0988 2.74
force 101 0.6763 68.3091 89.14
commHalo 303 0.0038 1.1384 1.49
commReduce 39 0.0005 0.0201 0.03
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 0: 77.3916 1: 77.3920 77.3918 0.0002
loop 0: 76.6298 1: 76.6298 76.6298 0.0000
timestep 0: 76.6292 1: 76.6294 76.6293 0.0001
position 0: 0.4611 1: 0.4670 0.4640 0.0029
velocity 0: 0.7976 1: 0.8155 0.8065 0.0090
redistribute 1: 7.5239 0: 7.5253 7.5246 0.0007
atomHalo 1: 1.1657 0: 2.0988 1.6323 0.4666
force 1: 68.2965 0: 68.3091 68.3028 0.0063
commHalo 1: 0.1171 0: 1.1384 0.6278 0.5106
commReduce 1: 0.0070 0: 0.0201 0.0135 0.0065
---------------------------------------------------
Average atom update rate: 0.38 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.19 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 5.22 atoms/us
---------------------------------------------------
Thu Feb 22 15:19:11 2024: CoMD Ending
* Info: Process finished (host gmz17.benchmarkcenter.megware.com, process 154525)
* Warning: (host gmz17.benchmarkcenter.megware.com, process 154525) Observed more threads (5) than expected (4): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=5.
* Info: Process finished (host gmz17.benchmarkcenter.megware.com, process 154526)
* Warning: (host gmz17.benchmarkcenter.megware.com, process 154526) Observed more threads (5) than expected (4): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=5.
Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_2
To display your profiling results:
##################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_2 #
# Functions | Per-node | maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_2 #
# Functions | Per-process | maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_2 #
# Functions | Per-thread | maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_2 #
# Loops | Cluster-wide | maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_2 #
# Loops | Per-node | maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_2 #
# Loops | Per-process | maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_2 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_2 #
##################################################################################################################################################################################################
* Info: Detected 2 Lprof instances in gmz17.benchmarkcenter.megware.com: processes-per-node/ppn set accordingly.
If this is incorrect, rerun with an explicit value for this setting
* Info: Selecting the 'perf-high-ppn' engine for node gmz17.benchmarkcenter.megware.com
* Info: "ref-cycles" not supported on gmz17.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host gmz17.benchmarkcenter.megware.com, process 154618)
* Info: Process launched (host gmz17.benchmarkcenter.megware.com, process 154622)
Thu Feb 22 15:19:21 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: gmz16.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -DDO_MPI -O3 -march=haswell -fno-tree-vectorize -fno-openmp-simd -funroll-loops -ffast-math -g -grecord-gcc-switches -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=gcc -fopenmp -funroll-loops'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (8 threads)
Double Precision: true
Run Date/Time: 2024-02-22, 15:19:21
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303663, atom count : 4000000
Thu Feb 22 15:19:22 2024: Initialization Finished
Thu Feb 22 15:19:22 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303663 -1.243619295263 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650500 -1.233157709949 0.067098059449 519.0938 0.1672 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.2083 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.2201 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.2232 4000000
50 50.00 -1.166051684893 -1.193713710258 0.027662025365 214.0030 0.2243 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.2235 4000000
70 70.00 -1.166052143011 -1.204911990845 0.038859847833 300.6332 0.2205 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.2191 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.2180 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.2168 4000000
Thu Feb 22 15:20:05 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303663
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 43.3447 43.3447 101.22
loop 1 42.8219 42.8219 100.00
timestep 10 4.2821 42.8213 100.00
position 100 0.0023 0.2298 0.54
velocity 200 0.0019 0.3814 0.89
redistribute 101 0.0711 7.1813 16.77
atomHalo 101 0.0240 2.4290 5.67
force 101 0.3495 35.2965 82.43
commHalo 303 0.0047 1.4297 3.34
commReduce 39 0.0011 0.0431 0.10
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 0: 43.3447 1: 43.3449 43.3448 0.0001
loop 0: 42.8219 1: 42.8219 42.8219 0.0000
timestep 0: 42.8213 1: 42.8215 42.8214 0.0001
position 0: 0.2298 1: 0.2602 0.2450 0.0152
velocity 0: 0.3814 1: 0.4150 0.3982 0.0168
redistribute 0: 7.1813 1: 7.2442 7.2127 0.0314
atomHalo 1: 1.2681 0: 2.4290 1.8486 0.5805
force 1: 35.1713 0: 35.2965 35.2339 0.0626
commHalo 1: 0.1200 0: 1.4297 0.7748 0.6548
commReduce 1: 0.0160 0: 0.0431 0.0295 0.0135
---------------------------------------------------
Average atom update rate: 0.21 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.11 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 9.34 atoms/us
---------------------------------------------------
Thu Feb 22 15:20:05 2024: CoMD Ending
* Info: Process finished (host gmz17.benchmarkcenter.megware.com, process 154622)
* Warning: (host gmz17.benchmarkcenter.megware.com, process 154622) Observed more threads (9) than expected (8): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=9.
* Info: Process finished (host gmz17.benchmarkcenter.megware.com, process 154618)
* Warning: (host gmz17.benchmarkcenter.megware.com, process 154618) Observed more threads (9) than expected (8): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=9.
Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_3
To display your profiling results:
##################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_3 #
# Functions | Per-node | maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_3 #
# Functions | Per-process | maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_3 #
# Functions | Per-thread | maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_3 #
# Loops | Cluster-wide | maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_3 #
# Loops | Per-node | maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_3 #
# Loops | Per-process | maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_3 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_3 #
##################################################################################################################################################################################################
* Info: Detected 2 Lprof instances in gmz17.benchmarkcenter.megware.com: processes-per-node/ppn set accordingly.
If this is incorrect, rerun with an explicit value for this setting
* Info: Selecting the 'perf-high-ppn' engine for node gmz17.benchmarkcenter.megware.com
* Info: "ref-cycles" not supported on gmz17.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host gmz17.benchmarkcenter.megware.com, process 154700)
* Info: Process launched (host gmz17.benchmarkcenter.megware.com, process 154701)
Thu Feb 22 15:21:10 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: gmz16.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -DDO_MPI -O3 -march=haswell -fno-tree-vectorize -fno-openmp-simd -funroll-loops -ffast-math -g -grecord-gcc-switches -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=gcc -fopenmp -funroll-loops'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (16 threads)
Double Precision: true
Run Date/Time: 2024-02-22, 15:21:10
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303487, atom count : 4000000
Thu Feb 22 15:21:11 2024: Initialization Finished
Thu Feb 22 15:21:11 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303487 -1.243619295087 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650500 -1.233157709949 0.067098059449 519.0938 0.1065 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.1278 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.1279 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.1306 4000000
50 50.00 -1.166051684893 -1.193713710257 0.027662025365 214.0030 0.1312 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.1284 4000000
70 70.00 -1.166052143011 -1.204911990844 0.038859847833 300.6332 0.1274 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.1270 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.1262 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.1255 4000000
Thu Feb 22 15:21:36 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303487
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 25.5495 25.5495 101.51
loop 1 25.1698 25.1698 100.00
timestep 10 2.5169 25.1692 100.00
position 100 0.0015 0.1459 0.58
velocity 200 0.0012 0.2359 0.94
redistribute 101 0.0657 6.6370 26.37
atomHalo 101 0.0257 2.5911 10.29
force 101 0.1812 18.2986 72.70
commHalo 303 0.0046 1.3955 5.54
commReduce 39 0.0007 0.0268 0.11
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 1: 25.5494 0: 25.5495 25.5495 0.0001
loop 0: 25.1698 1: 25.1698 25.1698 0.0000
timestep 0: 25.1692 1: 25.1694 25.1693 0.0001
position 0: 0.1459 1: 0.1496 0.1478 0.0018
velocity 0: 0.2359 1: 0.2687 0.2523 0.0164
redistribute 1: 6.6242 0: 6.6370 6.6306 0.0064
atomHalo 1: 1.5182 0: 2.5911 2.0547 0.5364
force 1: 18.2960 0: 18.2986 18.2973 0.0013
commHalo 1: 0.1201 0: 1.3955 0.7578 0.6377
commReduce 1: 0.0042 0: 0.0268 0.0155 0.0113
---------------------------------------------------
Average atom update rate: 0.13 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.06 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 15.89 atoms/us
---------------------------------------------------
Thu Feb 22 15:21:36 2024: CoMD Ending
* Info: Process finished (host gmz17.benchmarkcenter.megware.com, process 154701)
* Warning: (host gmz17.benchmarkcenter.megware.com, process 154701) Observed more threads (17) than expected (16): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=17.
* Info: Process finished (host gmz17.benchmarkcenter.megware.com, process 154700)
* Warning: (host gmz17.benchmarkcenter.megware.com, process 154700) Observed more threads (17) than expected (16): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=17.
Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_4
To display your profiling results:
##################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_4 #
# Functions | Per-node | maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_4 #
# Functions | Per-process | maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_4 #
# Functions | Per-thread | maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_4 #
# Loops | Cluster-wide | maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_4 #
# Loops | Per-node | maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_4 #
# Loops | Per-process | maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_4 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_4 #
##################################################################################################################################################################################################
* Info: Detected 2 Lprof instances in gmz17.benchmarkcenter.megware.com: processes-per-node/ppn set accordingly.
If this is incorrect, rerun with an explicit value for this setting
* Info: Selecting the 'perf-high-ppn' engine for node gmz17.benchmarkcenter.megware.com
* Info: "ref-cycles" not supported on gmz17.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host gmz17.benchmarkcenter.megware.com, process 154838)
* Info: Process launched (host gmz17.benchmarkcenter.megware.com, process 154837)
Thu Feb 22 15:21:47 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: gmz16.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -DDO_MPI -O3 -march=haswell -fno-tree-vectorize -fno-openmp-simd -funroll-loops -ffast-math -g -grecord-gcc-switches -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=gcc -fopenmp -funroll-loops'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (32 threads)
Double Precision: true
Run Date/Time: 2024-02-22, 15:21:47
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303522, atom count : 4000000
Thu Feb 22 15:21:47 2024: Initialization Finished
Thu Feb 22 15:21:47 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303522 -1.243619295122 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650500 -1.233157709949 0.067098059449 519.0938 0.0733 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.0792 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.0814 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.0816 4000000
50 50.00 -1.166051684893 -1.193713710258 0.027662025365 214.0030 0.0820 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.0818 4000000
70 70.00 -1.166052143011 -1.204911990844 0.038859847833 300.6332 0.0818 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.0812 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.0819 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.0809 4000000
Thu Feb 22 15:22:03 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303522
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 16.4519 16.4519 102.16
loop 1 16.1039 16.1039 100.00
timestep 10 1.6103 16.1033 100.00
position 100 0.0009 0.0856 0.53
velocity 200 0.0008 0.1502 0.93
redistribute 101 0.0560 5.6560 35.12
atomHalo 101 0.0208 2.0972 13.02
force 101 0.1023 10.3352 64.18
commHalo 303 0.0027 0.8226 5.11
commReduce 39 0.0001 0.0034 0.02
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 0: 16.4519 1: 16.4520 16.4519 0.0000
loop 0: 16.1039 1: 16.1039 16.1039 0.0000
timestep 0: 16.1033 1: 16.1036 16.1035 0.0001
position 0: 0.0856 1: 0.0961 0.0908 0.0052
velocity 1: 0.1500 0: 0.1502 0.1501 0.0001
redistribute 0: 5.6560 1: 5.8196 5.7378 0.0818
atomHalo 1: 1.4375 0: 2.0972 1.7674 0.3298
force 1: 10.1454 0: 10.3352 10.2403 0.0949
commHalo 1: 0.1341 0: 0.8226 0.4783 0.3442
commReduce 0: 0.0034 1: 0.0283 0.0158 0.0125
---------------------------------------------------
Average atom update rate: 0.08 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.04 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 24.84 atoms/us
---------------------------------------------------
Thu Feb 22 15:22:03 2024: CoMD Ending
* Info: Process finished (host gmz17.benchmarkcenter.megware.com, process 154837)
* Warning: (host gmz17.benchmarkcenter.megware.com, process 154837) Observed more threads (33) than expected (32): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=33.
* Info: Process finished (host gmz17.benchmarkcenter.megware.com, process 154838)
* Warning: (host gmz17.benchmarkcenter.megware.com, process 154838) Observed more threads (33) than expected (32): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=33.
Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_5
To display your profiling results:
##################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_5 #
# Functions | Per-node | maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_5 #
# Functions | Per-process | maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_5 #
# Functions | Per-thread | maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_5 #
# Loops | Cluster-wide | maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_5 #
# Loops | Per-node | maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_5 #
# Loops | Per-process | maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_5 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_5 #
##################################################################################################################################################################################################
* Info: Detected 2 Lprof instances in gmz17.benchmarkcenter.megware.com: processes-per-node/ppn set accordingly.
If this is incorrect, rerun with an explicit value for this setting
* Info: Selecting the 'perf-high-ppn' engine for node gmz17.benchmarkcenter.megware.com
* Info: "ref-cycles" not supported on gmz17.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host gmz17.benchmarkcenter.megware.com, process 155016)
* Info: Process launched (host gmz17.benchmarkcenter.megware.com, process 155015)
Thu Feb 22 15:22:14 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: gmz16.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -DDO_MPI -O3 -march=haswell -fno-tree-vectorize -fno-openmp-simd -funroll-loops -ffast-math -g -grecord-gcc-switches -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=gcc -fopenmp -funroll-loops'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (64 threads)
Double Precision: true
Run Date/Time: 2024-02-22, 15:22:14
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303459, atom count : 4000000
Thu Feb 22 15:22:14 2024: Initialization Finished
Thu Feb 22 15:22:14 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303459 -1.243619295059 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650500 -1.233157709949 0.067098059449 519.0938 0.0584 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.0589 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.0590 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.0595 4000000
50 50.00 -1.166051684893 -1.193713710258 0.027662025365 214.0030 0.0595 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.0591 4000000
70 70.00 -1.166052143011 -1.204911990844 0.038859847833 300.6332 0.0591 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.0569 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.0566 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.0566 4000000
Thu Feb 22 15:22:26 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303459
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 12.0068 12.0068 102.84
loop 1 11.6747 11.6747 100.00
timestep 10 1.1673 11.6735 99.99
position 100 0.0008 0.0779 0.67
velocity 200 0.0006 0.1150 0.98
redistribute 101 0.0585 5.9078 50.60
atomHalo 101 0.0253 2.5562 21.90
force 101 0.0560 5.6591 48.47
commHalo 303 0.0038 1.1412 9.77
commReduce 39 0.0006 0.0228 0.20
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 0: 12.0068 1: 12.0070 12.0069 0.0001
loop 0: 11.6747 1: 11.6748 11.6748 0.0000
timestep 0: 11.6735 1: 11.6737 11.6736 0.0001
position 0: 0.0779 1: 0.0895 0.0837 0.0058
velocity 0: 0.1150 1: 0.1167 0.1158 0.0009
redistribute 0: 5.9078 1: 5.9211 5.9145 0.0066
atomHalo 1: 1.6403 0: 2.5562 2.0982 0.4580
force 1: 5.6404 0: 5.6591 5.6497 0.0094
commHalo 1: 0.1224 0: 1.1412 0.6318 0.5094
commReduce 1: 0.0039 0: 0.0228 0.0134 0.0095
---------------------------------------------------
Average atom update rate: 0.06 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.03 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 34.27 atoms/us
---------------------------------------------------
Thu Feb 22 15:22:26 2024: CoMD Ending
* Info: Process finished (host gmz17.benchmarkcenter.megware.com, process 155015)
* Warning: (host gmz17.benchmarkcenter.megware.com, process 155015) Observed more threads (65) than expected (64): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=65.
* Info: Process finished (host gmz17.benchmarkcenter.megware.com, process 155016)
* Warning: (host gmz17.benchmarkcenter.megware.com, process 155016) Observed more threads (65) than expected (64): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=65.
Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_6
To display your profiling results:
##################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_6 #
# Functions | Per-node | maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_6 #
# Functions | Per-process | maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_6 #
# Functions | Per-thread | maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_6 #
# Loops | Cluster-wide | maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_6 #
# Loops | Per-node | maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_6 #
# Loops | Per-process | maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_6 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_6 #
##################################################################################################################################################################################################
* Info: Detected 2 Lprof instances in gmz17.benchmarkcenter.megware.com: processes-per-node/ppn set accordingly.
If this is incorrect, rerun with an explicit value for this setting
* Info: Selecting the 'perf-high-ppn' engine for node gmz17.benchmarkcenter.megware.com
* Info: "ref-cycles" not supported on gmz17.benchmarkcenter.megware.com: fallback to "cpu-clock"
* Warning: Found no event able to derive walltime: prepending cpu-clock
* Info: Process launched (host gmz17.benchmarkcenter.megware.com, process 155340)
* Info: Process launched (host gmz17.benchmarkcenter.megware.com, process 155344)
Thu Feb 22 15:22:50 2024: Starting Initialization
Mini-Application Name : CoMD-openmp-mpi
Mini-Application Version : 1.1
Platform:
hostname: gmz16.benchmarkcenter.megware.com
kernel name: 'Linux'
kernel release: '5.14.0-362.13.1.el9_3.x86_64'
processor: 'x86_64'
Build:
CC: '/cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicc'
compiler version: 'unknown'
CFLAGS: '-O3 -march=native -DDO_MPI -O3 -march=haswell -fno-tree-vectorize -fno-openmp-simd -funroll-loops -ffast-math -g -grecord-gcc-switches -fno-omit-frame-pointer -fcf-protection=none -no-pie -cc=gcc -fopenmp -funroll-loops'
LDFLAGS: ' '
using MPI: true
Threading: OpenMP (96 threads)
Double Precision: true
Run Date/Time: 2024-02-22, 15:22:50
Command Line Parameters:
doeam: 0
potDir: pots
potName: Cu_u6.eam
potType: funcfl
nx: 100
ny: 100
nz: 100
xproc: 2
yproc: 1
zproc: 1
Lattice constant: -1 Angstroms
nSteps: 100
printRate: 10
Time step: 1 fs
Initial Temperature: 600 K
Initial Delta: 0 Angstroms
Simulation data:
Total atoms : 4000000
Min global bounds : [ 0.0000000000, 0.0000000000, 0.0000000000 ]
Max global bounds : [ 361.5000000000, 361.5000000000, 361.5000000000 ]
Decomposition data:
Processors : 2, 1, 1
Local boxes : 31, 62, 62 = 119164
Box size : [ 5.8306451613, 5.8306451613, 5.8306451613 ]
Box factor : [ 1.0074548875, 1.0074548875, 1.0074548875 ]
Max Link Cell Occupancy: 32 of 64
Potential data:
Potential type : Lennard-Jones
Species name : Cu
Atomic number : 29
Mass : 63.55 amu
Lattice Type : FCC
Lattice spacing : 3.615 Angstroms
Cutoff : 5.7875 Angstroms
Epsilon : 0.167 eV
Sigma : 2.315 Angstroms
Initial energy : -1.166063303456, atom count : 4000000
Thu Feb 22 15:22:51 2024: Initialization Finished
Thu Feb 22 15:22:51 2024: Starting simulation
# Performance
# Loop Time(fs) Total Energy Potential Energy Kinetic Energy Temperature (us/atom) # Atoms
0 0.00 -1.166063303456 -1.243619295056 0.077555991600 600.0000 0.0000 4000000
10 10.00 -1.166059650500 -1.233157709949 0.067098059449 519.0938 0.0537 4000000
20 20.00 -1.166048438416 -1.208183014318 0.042134575902 325.9677 0.0545 4000000
30 30.00 -1.166037590737 -1.186586197151 0.020548606414 158.9711 0.0511 4000000
40 40.00 -1.166042093134 -1.183625399859 0.017583306724 136.0305 0.0509 4000000
50 50.00 -1.166051684893 -1.193713710258 0.027662025365 214.0030 0.0511 4000000
60 60.00 -1.166054646931 -1.202662201513 0.036607554582 283.2087 0.0515 4000000
70 70.00 -1.166052143011 -1.204911990844 0.038859847833 300.6332 0.0507 4000000
80 80.00 -1.166048803912 -1.203635015020 0.037586211108 290.7799 0.0507 4000000
90 90.00 -1.166048006780 -1.203820491599 0.037772484818 292.2210 0.0506 4000000
100 100.00 -1.166049793504 -1.206862845060 0.040813051556 315.7439 0.0504 4000000
Thu Feb 22 15:23:01 2024: Ending simulation
Simulation Validation:
Initial energy : -1.166063303456
Final energy : -1.166049793504
eFinal/eInitial : 0.999988
Final atom count : 4000000, no atoms lost
Timings for Rank 0
Timer # Calls Avg/Call (s) Total (s) % Loop
___________________________________________________________________
total 1 10.6488 10.6488 103.32
loop 1 10.3065 10.3065 100.00
timestep 10 1.0304 10.3037 99.97
position 100 0.0006 0.0603 0.59
velocity 200 0.0007 0.1454 1.41
redistribute 101 0.0594 6.0024 58.24
atomHalo 101 0.0265 2.6731 25.94
force 101 0.0413 4.1691 40.45
commHalo 303 0.0041 1.2538 12.17
commReduce 39 0.0006 0.0215 0.21
Timing Statistics Across 2 Ranks:
Timer Rank: Min(s) Rank: Max(s) Avg(s) Stdev(s)
_____________________________________________________________________________
total 0: 10.6488 1: 10.6490 10.6489 0.0001
loop 0: 10.3065 1: 10.3065 10.3065 0.0000
timestep 0: 10.3037 1: 10.3040 10.3039 0.0001
position 0: 0.0603 1: 0.0644 0.0623 0.0020
velocity 0: 0.1454 1: 0.1671 0.1562 0.0109
redistribute 0: 6.0024 1: 6.0336 6.0180 0.0156
atomHalo 1: 1.7013 0: 2.6731 2.1872 0.4859
force 1: 4.1284 0: 4.1691 4.1488 0.0203
commHalo 1: 0.1220 0: 1.2538 0.6879 0.5659
commReduce 1: 0.0073 0: 0.0215 0.0144 0.0071
---------------------------------------------------
Average atom update rate: 0.05 us/atom/task
---------------------------------------------------
---------------------------------------------------
Average all atom update rate: 0.03 us/atom
---------------------------------------------------
---------------------------------------------------
Average atom rate: 38.82 atoms/us
---------------------------------------------------
Thu Feb 22 15:23:01 2024: CoMD Ending
* Info: Process finished (host gmz17.benchmarkcenter.megware.com, process 155340)
* Warning: (host gmz17.benchmarkcenter.megware.com, process 155340) Observed more threads (97) than expected (96): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=97.
* Info: Process finished (host gmz17.benchmarkcenter.megware.com, process 155344)
* Warning: (host gmz17.benchmarkcenter.megware.com, process 155344) Observed more threads (97) than expected (96): in case of high IO overhead or suspicious profile, rerun with maximum-threads-per-process=97.
Your experiment path is /beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_7
To display your profiling results:
##################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
##################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_7 #
# Functions | Per-node | maqao lprof -df -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_7 #
# Functions | Per-process | maqao lprof -df -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_7 #
# Functions | Per-thread | maqao lprof -df -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_7 #
# Loops | Cluster-wide | maqao lprof -dl xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_7 #
# Loops | Per-node | maqao lprof -dl -dn xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_7 #
# Loops | Per-process | maqao lprof -dl -dp xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_7 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/beegfs/hackathon/users/eoseret/qaas_runs/CoMD/intel/CoMD/run/oneview_runs/compilers/gcc_4/oneview_results_scal/tools/lprof_npsu_run_7 #
##################################################################################################################################################################################################