* Info: Selecting the 'perf-high-ppn' engine for node idp09.benchmarkcenter.megware.com
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 291918)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 291923)
   _  __       _         _
  | |/ /      (_)       | |
  | ' /  _ __  _  _ __  | | __ ___
  |  <  | '__|| || '_ \ | |/ // _ \
  | . \ | |   | || |_) ||   <|  __/
  |_|\_\|_|   |_|| .__/ |_|\_\\___|
                 | |
                 |_|        Version 1.2.4
LLNL-CODE-775068
Copyright (c) 2014-2019, Lawrence Livermore National Security, LLC
Kripke is released under the BSD 3-Clause License, please see the
LICENSE file for the full license
This work was produced under the auspices of the U.S. Department of
Energy by Lawrence Livermore National Laboratory under Contract
DE-AC52-07NA27344.
Author: Adam J. Kunen
Compilation Options:
Architecture: OpenMP
Compiler: /cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicpc
Compiler Flags: "-O3 -march=native -O2 -march=sapphirerapids -mprefer-vector-width=512 -funroll-loops -ffast-math -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cxx=g++ -Wall -Wextra "
Linker Flags: " "
CHAI Enabled: No
CUDA Enabled: No
MPI Enabled: Yes
OpenMP Enabled: Yes
Caliper Enabled: No
OpenMP Thread->Core mapping for 1 threads on rank 0
0-> 0
Input Parameters
================
Problem Size:
Zones: 16 x 16 x 16 (4096 total)
Groups: 1024
Legendre Order: 4
Quadrature Set: Dummy S2 with 96 points
Physical Properties:
Total X-Sec: sigt=[0.100000, 0.000100, 0.100000]
Scattering X-Sec: sigs=[0.050000, 0.000050, 0.050000]
Solver Options:
Number iterations: 10
MPI Decomposition Options:
Total MPI tasks: 2
Spatial decomp: 2 x 1 x 1 MPI tasks
Block solve method: Sweep
Per-Task Options:
DirSets/Directions: 8 sets, 12 directions/set
GroupSet/Groups: 2 sets, 512 groups/set
Zone Sets: 1 x 1 x 1
Architecture: OpenMP
Data Layout: DGZ
Generating Problem
==================
Decomposition Space: Procs: Subdomains (local/global):
--------------------- ---------- --------------------------
(P) Energy: 1 2 / 2
(Q) Direction: 1 8 / 8
(R) Space: 2 1 / 2
(Rx,Ry,Rz) R in XYZ: 2x1x1 1x1x1 / 2x1x1
(PQR) TOTAL: 2 16 / 32
Material Volumes=[8.789062e+03, 1.177734e+05, 2.753438e+06]
Memory breakdown of Field variables:
Field Variable Num Elements Megabytes
-------------- ------------ ---------
data/sigs 15728640 120.000
dx 16 0.000
dy 16 0.000
dz 16 0.000
ell 2400 0.018
ell_plus 2400 0.018
i_plane 25165824 192.000
j_plane 25165824 192.000
k_plane 25165824 192.000
mixelem_to_fraction 4352 0.033
phi 104857600 800.000
phi_out 104857600 800.000
psi 402653184 3072.000
quadrature/w 96 0.001
quadrature/xcos 96 0.001
quadrature/ycos 96 0.001
quadrature/zcos 96 0.001
rhs 402653184 3072.000
sigt_zonal 4194304 32.000
volume 4096 0.031
-------- ------------ ---------
TOTAL 1110455664 8472.104
Generation Complete!
Steady State Solve
==================
iter 0: particle count=1.197998e+09, change=1.000000e+00
iter 1: particle count=1.801368e+09, change=3.349511e-01
iter 2: particle count=2.102278e+09, change=1.431351e-01
iter 3: particle count=2.251810e+09, change=6.640521e-02
iter 4: particle count=2.325888e+09, change=3.184924e-02
iter 5: particle count=2.362467e+09, change=1.548355e-02
iter 6: particle count=2.380471e+09, change=7.563193e-03
iter 7: particle count=2.389305e+09, change=3.697158e-03
iter 8: particle count=2.393627e+09, change=1.805479e-03
iter 9: particle count=2.395735e+09, change=8.801810e-04
Solver terminated
Timers
======
Timer Count Seconds
---------------- ------------ ------------
Generate 1 0.01863
LPlusTimes 10 48.85946
LTimes 10 47.98556
Population 10 2.43956
Scattering 10 926.58573
Solve 1 1041.63222
Source 10 0.01012
SweepSolver 10 14.69851
SweepSubdomain 160 14.04403
TIMER_NAMES:Generate,LPlusTimes,LTimes,Population,Scattering,Solve,Source,SweepSolver,SweepSubdomain
TIMER_DATA:0.018629,48.859463,47.985561,2.439560,926.585725,1041.632218,0.010116,14.698511,14.044030
Figures of Merit
================
Throughput: 3.865598e+06 [unknowns/(second/iteration)]
Grind time : 2.586922e-07 [(seconds/iteration)/unknowns]
Sweep efficiency : 95.54730 [100.0 * SweepSubdomain time / SweepSolver time]
Number of unknowns: 402653184
END
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 291918)
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 291923)
Your experiment path is /home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_0
To display your profiling results:
#################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_0 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_0 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_0 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_0 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_0 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_0 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_0 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_0 #
#################################################################################################################################################################################################
* Info: Selecting the 'perf-high-ppn' engine for node idp09.benchmarkcenter.megware.com
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292055)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292060)
Compilation Options:
Architecture: OpenMP
Compiler: /cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicpc
Compiler Flags: "-O3 -march=native -O2 -march=sapphirerapids -mprefer-vector-width=512 -funroll-loops -ffast-math -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cxx=g++ -Wall -Wextra "
Linker Flags: " "
CHAI Enabled: No
CUDA Enabled: No
MPI Enabled: Yes
OpenMP Enabled: Yes
Caliper Enabled: No
OpenMP Thread->Core mapping for 2 threads on rank 0
0-> 0 1-> 24
Input Parameters
================
Problem Size:
Zones: 16 x 16 x 16 (4096 total)
Groups: 1024
Legendre Order: 4
Quadrature Set: Dummy S2 with 96 points
Physical Properties:
Total X-Sec: sigt=[0.100000, 0.000100, 0.100000]
Scattering X-Sec: sigs=[0.050000, 0.000050, 0.050000]
Solver Options:
Number iterations: 10
MPI Decomposition Options:
Total MPI tasks: 2
Spatial decomp: 2 x 1 x 1 MPI tasks
Block solve method: Sweep
Per-Task Options:
DirSets/Directions: 8 sets, 12 directions/set
GroupSet/Groups: 2 sets, 512 groups/set
Zone Sets: 1 x 1 x 1
Architecture: OpenMP
Data Layout: DGZ
Generating Problem
==================
Decomposition Space: Procs: Subdomains (local/global):
--------------------- ---------- --------------------------
(P) Energy: 1 2 / 2
(Q) Direction: 1 8 / 8
(R) Space: 2 1 / 2
(Rx,Ry,Rz) R in XYZ: 2x1x1 1x1x1 / 2x1x1
(PQR) TOTAL: 2 16 / 32
Material Volumes=[8.789062e+03, 1.177734e+05, 2.753438e+06]
Memory breakdown of Field variables:
Field Variable Num Elements Megabytes
-------------- ------------ ---------
data/sigs 15728640 120.000
dx 16 0.000
dy 16 0.000
dz 16 0.000
ell 2400 0.018
ell_plus 2400 0.018
i_plane 25165824 192.000
j_plane 25165824 192.000
k_plane 25165824 192.000
mixelem_to_fraction 4352 0.033
phi 104857600 800.000
phi_out 104857600 800.000
psi 402653184 3072.000
quadrature/w 96 0.001
quadrature/xcos 96 0.001
quadrature/ycos 96 0.001
quadrature/zcos 96 0.001
rhs 402653184 3072.000
sigt_zonal 4194304 32.000
volume 4096 0.031
-------- ------------ ---------
TOTAL 1110455664 8472.104
Generation Complete!
Steady State Solve
==================
iter 0: particle count=1.197998e+09, change=1.000000e+00
iter 1: particle count=1.801368e+09, change=3.349511e-01
iter 2: particle count=2.102278e+09, change=1.431351e-01
iter 3: particle count=2.251810e+09, change=6.640521e-02
iter 4: particle count=2.325888e+09, change=3.184924e-02
iter 5: particle count=2.362467e+09, change=1.548355e-02
iter 6: particle count=2.380471e+09, change=7.563193e-03
iter 7: particle count=2.389305e+09, change=3.697158e-03
iter 8: particle count=2.393627e+09, change=1.805479e-03
iter 9: particle count=2.395735e+09, change=8.801810e-04
Solver terminated
Timers
======
Timer Count Seconds
---------------- ------------ ------------
Generate 1 0.01970
LPlusTimes 10 27.48438
LTimes 10 29.81147
Population 10 1.24365
Scattering 10 465.17646
Solve 1 540.76831
Source 10 0.00591
SweepSolver 10 15.98915
SweepSubdomain 160 8.56812
TIMER_NAMES:Generate,LPlusTimes,LTimes,Population,Scattering,Solve,Source,SweepSolver,SweepSubdomain
TIMER_DATA:0.019701,27.484381,29.811470,1.243646,465.176457,540.768311,0.005915,15.989155,8.568121
Figures of Merit
================
Throughput: 7.445946e+06 [unknowns/(second/iteration)]
Grind time : 1.343013e-07 [(seconds/iteration)/unknowns]
Sweep efficiency : 53.58708 [100.0 * SweepSubdomain time / SweepSolver time]
Number of unknowns: 402653184
END
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292060)
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292055)
Info: 1/2 lprof instances finished
Your experiment path is /home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_1
To display your profiling results:
#################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_1 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_1 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_1 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_1 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_1 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_1 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_1 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_1 #
#################################################################################################################################################################################################
* Info: Selecting the 'perf-high-ppn' engine for node idp09.benchmarkcenter.megware.com
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292164)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292169)
Compilation Options:
Architecture: OpenMP
Compiler: /cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicpc
Compiler Flags: "-O3 -march=native -O2 -march=sapphirerapids -mprefer-vector-width=512 -funroll-loops -ffast-math -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cxx=g++ -Wall -Wextra "
Linker Flags: " "
CHAI Enabled: No
CUDA Enabled: No
MPI Enabled: Yes
OpenMP Enabled: Yes
Caliper Enabled: No
OpenMP Thread->Core mapping for 4 threads on rank 0
0-> 0 1-> 12 2-> 24 3-> 36
Input Parameters
================
Problem Size:
Zones: 16 x 16 x 16 (4096 total)
Groups: 1024
Legendre Order: 4
Quadrature Set: Dummy S2 with 96 points
Physical Properties:
Total X-Sec: sigt=[0.100000, 0.000100, 0.100000]
Scattering X-Sec: sigs=[0.050000, 0.000050, 0.050000]
Solver Options:
Number iterations: 10
MPI Decomposition Options:
Total MPI tasks: 2
Spatial decomp: 2 x 1 x 1 MPI tasks
Block solve method: Sweep
Per-Task Options:
DirSets/Directions: 8 sets, 12 directions/set
GroupSet/Groups: 2 sets, 512 groups/set
Zone Sets: 1 x 1 x 1
Architecture: OpenMP
Data Layout: DGZ
Generating Problem
==================
Decomposition Space: Procs: Subdomains (local/global):
--------------------- ---------- --------------------------
(P) Energy: 1 2 / 2
(Q) Direction: 1 8 / 8
(R) Space: 2 1 / 2
(Rx,Ry,Rz) R in XYZ: 2x1x1 1x1x1 / 2x1x1
(PQR) TOTAL: 2 16 / 32
Material Volumes=[8.789062e+03, 1.177734e+05, 2.753438e+06]
Memory breakdown of Field variables:
Field Variable Num Elements Megabytes
-------------- ------------ ---------
data/sigs 15728640 120.000
dx 16 0.000
dy 16 0.000
dz 16 0.000
ell 2400 0.018
ell_plus 2400 0.018
i_plane 25165824 192.000
j_plane 25165824 192.000
k_plane 25165824 192.000
mixelem_to_fraction 4352 0.033
phi 104857600 800.000
phi_out 104857600 800.000
psi 402653184 3072.000
quadrature/w 96 0.001
quadrature/xcos 96 0.001
quadrature/ycos 96 0.001
quadrature/zcos 96 0.001
rhs 402653184 3072.000
sigt_zonal 4194304 32.000
volume 4096 0.031
-------- ------------ ---------
TOTAL 1110455664 8472.104
Generation Complete!
Steady State Solve
==================
iter 0: particle count=1.197998e+09, change=1.000000e+00
iter 1: particle count=1.801368e+09, change=3.349511e-01
iter 2: particle count=2.102278e+09, change=1.431351e-01
iter 3: particle count=2.251810e+09, change=6.640521e-02
iter 4: particle count=2.325888e+09, change=3.184924e-02
iter 5: particle count=2.362467e+09, change=1.548355e-02
iter 6: particle count=2.380471e+09, change=7.563193e-03
iter 7: particle count=2.389305e+09, change=3.697158e-03
iter 8: particle count=2.393627e+09, change=1.805479e-03
iter 9: particle count=2.395735e+09, change=8.801810e-04
Solver terminated
Timers
======
Timer Count Seconds
---------------- ------------ ------------
Generate 1 0.01965
LPlusTimes 10 14.85723
LTimes 10 16.40089
Population 10 0.76111
Scattering 10 241.42428
Solve 1 286.50605
Source 10 0.00386
SweepSolver 10 12.00208
SweepSubdomain 160 4.57279
TIMER_NAMES:Generate,LPlusTimes,LTimes,Population,Scattering,Solve,Source,SweepSolver,SweepSubdomain
TIMER_DATA:0.019651,14.857232,16.400891,0.761110,241.424276,286.506050,0.003865,12.002080,4.572790
Figures of Merit
================
Throughput: 1.405392e+07 [unknowns/(second/iteration)]
Grind time : 7.115455e-08 [(seconds/iteration)/unknowns]
Sweep efficiency : 38.09998 [100.0 * SweepSubdomain time / SweepSolver time]
Number of unknowns: 402653184
END
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292164)
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292169)
Your experiment path is /home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_2
To display your profiling results:
#################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_2 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_2 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_2 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_2 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_2 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_2 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_2 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_2 #
#################################################################################################################################################################################################
* Info: Selecting the 'perf-high-ppn' engine for node idp09.benchmarkcenter.megware.com
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292261)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292266)
Compilation Options:
Architecture: OpenMP
Compiler: /cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicpc
Compiler Flags: "-O3 -march=native -O2 -march=sapphirerapids -mprefer-vector-width=512 -funroll-loops -ffast-math -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cxx=g++ -Wall -Wextra "
Linker Flags: " "
CHAI Enabled: No
CUDA Enabled: No
MPI Enabled: Yes
OpenMP Enabled: Yes
Caliper Enabled: No
OpenMP Thread->Core mapping for 8 threads on rank 0
0-> 0 1-> 6 2-> 12 3-> 18 4-> 24 5-> 30 6-> 36 7-> 42
Input Parameters
================
Problem Size:
Zones: 16 x 16 x 16 (4096 total)
Groups: 1024
Legendre Order: 4
Quadrature Set: Dummy S2 with 96 points
Physical Properties:
Total X-Sec: sigt=[0.100000, 0.000100, 0.100000]
Scattering X-Sec: sigs=[0.050000, 0.000050, 0.050000]
Solver Options:
Number iterations: 10
MPI Decomposition Options:
Total MPI tasks: 2
Spatial decomp: 2 x 1 x 1 MPI tasks
Block solve method: Sweep
Per-Task Options:
DirSets/Directions: 8 sets, 12 directions/set
GroupSet/Groups: 2 sets, 512 groups/set
Zone Sets: 1 x 1 x 1
Architecture: OpenMP
Data Layout: DGZ
Generating Problem
==================
Decomposition Space: Procs: Subdomains (local/global):
--------------------- ---------- --------------------------
(P) Energy: 1 2 / 2
(Q) Direction: 1 8 / 8
(R) Space: 2 1 / 2
(Rx,Ry,Rz) R in XYZ: 2x1x1 1x1x1 / 2x1x1
(PQR) TOTAL: 2 16 / 32
Material Volumes=[8.789062e+03, 1.177734e+05, 2.753438e+06]
Memory breakdown of Field variables:
Field Variable Num Elements Megabytes
-------------- ------------ ---------
data/sigs 15728640 120.000
dx 16 0.000
dy 16 0.000
dz 16 0.000
ell 2400 0.018
ell_plus 2400 0.018
i_plane 25165824 192.000
j_plane 25165824 192.000
k_plane 25165824 192.000
mixelem_to_fraction 4352 0.033
phi 104857600 800.000
phi_out 104857600 800.000
psi 402653184 3072.000
quadrature/w 96 0.001
quadrature/xcos 96 0.001
quadrature/ycos 96 0.001
quadrature/zcos 96 0.001
rhs 402653184 3072.000
sigt_zonal 4194304 32.000
volume 4096 0.031
-------- ------------ ---------
TOTAL 1110455664 8472.104
Generation Complete!
Steady State Solve
==================
iter 0: particle count=1.197998e+09, change=1.000000e+00
iter 1: particle count=1.801368e+09, change=3.349511e-01
iter 2: particle count=2.102278e+09, change=1.431351e-01
iter 3: particle count=2.251810e+09, change=6.640521e-02
iter 4: particle count=2.325888e+09, change=3.184924e-02
iter 5: particle count=2.362467e+09, change=1.548355e-02
iter 6: particle count=2.380471e+09, change=7.563193e-03
iter 7: particle count=2.389305e+09, change=3.697158e-03
iter 8: particle count=2.393627e+09, change=1.805479e-03
iter 9: particle count=2.395735e+09, change=8.801810e-04
Solver terminated
Timers
======
Timer Count Seconds
---------------- ------------ ------------
Generate 1 0.02043
LPlusTimes 10 9.51116
LTimes 10 8.58667
Population 10 0.85079
Scattering 10 140.17268
Solve 1 164.22890
Source 10 0.00289
SweepSolver 10 4.04279
SweepSubdomain 160 2.33546
TIMER_NAMES:Generate,LPlusTimes,LTimes,Population,Scattering,Solve,Source,SweepSolver,SweepSubdomain
TIMER_DATA:0.020426,9.511162,8.586667,0.850788,140.172681,164.228900,0.002890,4.042791,2.335459
Figures of Merit
================
Throughput: 2.451780e+07 [unknowns/(second/iteration)]
Grind time : 4.078669e-08 [(seconds/iteration)/unknowns]
Sweep efficiency : 57.76847 [100.0 * SweepSubdomain time / SweepSolver time]
Number of unknowns: 402653184
END
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292266)
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292261)
Info: 1/2 lprof instances finished
Your experiment path is /home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_3
To display your profiling results:
#################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_3 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_3 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_3 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_3 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_3 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_3 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_3 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_3 #
#################################################################################################################################################################################################
* Info: Selecting the 'perf-high-ppn' engine for node idp09.benchmarkcenter.megware.com
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292370)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292375)
Compilation Options:
Architecture: OpenMP
Compiler: /cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicpc
Compiler Flags: "-O3 -march=native -O2 -march=sapphirerapids -mprefer-vector-width=512 -funroll-loops -ffast-math -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cxx=g++ -Wall -Wextra "
Linker Flags: " "
CHAI Enabled: No
CUDA Enabled: No
MPI Enabled: Yes
OpenMP Enabled: Yes
Caliper Enabled: No
OpenMP Thread->Core mapping for 16 threads on rank 0
0-> 0 1-> 3 2-> 6 3-> 9 4-> 12 5-> 15 6-> 18 7-> 21
8-> 24 9-> 27 10-> 30 11-> 33 12-> 36 13-> 39 14-> 42 15-> 45
Input Parameters
================
Problem Size:
Zones: 16 x 16 x 16 (4096 total)
Groups: 1024
Legendre Order: 4
Quadrature Set: Dummy S2 with 96 points
Physical Properties:
Total X-Sec: sigt=[0.100000, 0.000100, 0.100000]
Scattering X-Sec: sigs=[0.050000, 0.000050, 0.050000]
Solver Options:
Number iterations: 10
MPI Decomposition Options:
Total MPI tasks: 2
Spatial decomp: 2 x 1 x 1 MPI tasks
Block solve method: Sweep
Per-Task Options:
DirSets/Directions: 8 sets, 12 directions/set
GroupSet/Groups: 2 sets, 512 groups/set
Zone Sets: 1 x 1 x 1
Architecture: OpenMP
Data Layout: DGZ
Generating Problem
==================
Decomposition Space: Procs: Subdomains (local/global):
--------------------- ---------- --------------------------
(P) Energy: 1 2 / 2
(Q) Direction: 1 8 / 8
(R) Space: 2 1 / 2
(Rx,Ry,Rz) R in XYZ: 2x1x1 1x1x1 / 2x1x1
(PQR) TOTAL: 2 16 / 32
Material Volumes=[8.789062e+03, 1.177734e+05, 2.753438e+06]
Memory breakdown of Field variables:
Field Variable Num Elements Megabytes
-------------- ------------ ---------
data/sigs 15728640 120.000
dx 16 0.000
dy 16 0.000
dz 16 0.000
ell 2400 0.018
ell_plus 2400 0.018
i_plane 25165824 192.000
j_plane 25165824 192.000
k_plane 25165824 192.000
mixelem_to_fraction 4352 0.033
phi 104857600 800.000
phi_out 104857600 800.000
psi 402653184 3072.000
quadrature/w 96 0.001
quadrature/xcos 96 0.001
quadrature/ycos 96 0.001
quadrature/zcos 96 0.001
rhs 402653184 3072.000
sigt_zonal 4194304 32.000
volume 4096 0.031
-------- ------------ ---------
TOTAL 1110455664 8472.104
Generation Complete!
Steady State Solve
==================
iter 0: particle count=1.197998e+09, change=1.000000e+00
iter 1: particle count=1.801368e+09, change=3.349511e-01
iter 2: particle count=2.102278e+09, change=1.431351e-01
iter 3: particle count=2.251810e+09, change=6.640521e-02
iter 4: particle count=2.325888e+09, change=3.184924e-02
iter 5: particle count=2.362467e+09, change=1.548355e-02
iter 6: particle count=2.380471e+09, change=7.563193e-03
iter 7: particle count=2.389305e+09, change=3.697158e-03
iter 8: particle count=2.393627e+09, change=1.805479e-03
iter 9: particle count=2.395735e+09, change=8.801810e-04
Solver terminated
Timers
======
Timer Count Seconds
---------------- ------------ ------------
Generate 1 0.01910
LPlusTimes 10 5.11317
LTimes 10 4.60814
Population 10 0.47810
Scattering 10 76.94158
Solve 1 90.23037
Source 10 0.00229
SweepSolver 10 2.03002
SweepSubdomain 160 1.19074
TIMER_NAMES:Generate,LPlusTimes,LTimes,Population,Scattering,Solve,Source,SweepSolver,SweepSubdomain
TIMER_DATA:0.019099,5.113167,4.608137,0.478103,76.941581,90.230367,0.002286,2.030022,1.190741
Figures of Merit
================
Throughput: 4.462502e+07 [unknowns/(second/iteration)]
Grind time : 2.240895e-08 [(seconds/iteration)/unknowns]
Sweep efficiency : 58.65654 [100.0 * SweepSubdomain time / SweepSolver time]
Number of unknowns: 402653184
END
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292370)
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292375)
Your experiment path is /home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_4
To display your profiling results:
#################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_4 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_4 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_4 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_4 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_4 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_4 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_4 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_4 #
#################################################################################################################################################################################################
* Info: Selecting the 'perf-high-ppn' engine for node idp09.benchmarkcenter.megware.com
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292507)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292512)
 _  __       _         _
| |/ /      (_)       | |
| ' /  _ __  _  _ __  | | __ ___
|  <  | '__|| || '_ \ | |/ // _ \
| . \ | |   | || |_) ||   <|  __/
|_|\_\|_|   |_|| .__/ |_|\_\\___|
               | |
               |_|        Version 1.2.4
LLNL-CODE-775068
Copyright (c) 2014-2019, Lawrence Livermore National Security, LLC
Kripke is released under the BSD 3-Clause License, please see the
LICENSE file for the full license
This work was produced under the auspices of the U.S. Department of
Energy by Lawrence Livermore National Laboratory under Contract
DE-AC52-07NA27344.
Author: Adam J. Kunen
Compilation Options:
Architecture: OpenMP
Compiler: /cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicpc
Compiler Flags: "-O3 -march=native -O2 -march=sapphirerapids -mprefer-vector-width=512 -funroll-loops -ffast-math -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cxx=g++ -Wall -Wextra "
Linker Flags: " "
CHAI Enabled: No
CUDA Enabled: No
MPI Enabled: Yes
OpenMP Enabled: Yes
Caliper Enabled: No
OpenMP Thread->Core mapping for 32 threads on rank 0
0-> 0 1-> 97 2-> 3 3->100 4-> 6 5->103 6-> 9 7->106
8-> 12 9->109 10-> 15 11->112 12-> 18 13->115 14-> 21 15->118
16-> 24 17->121 18-> 27 19->124 20-> 30 21->127 22-> 33 23->130
24-> 36 25->133 26-> 39 27->136 28-> 42 29->139 30-> 45 31->142
Input Parameters
================
Problem Size:
Zones: 16 x 16 x 16 (4096 total)
Groups: 1024
Legendre Order: 4
Quadrature Set: Dummy S2 with 96 points
Physical Properties:
Total X-Sec: sigt=[0.100000, 0.000100, 0.100000]
Scattering X-Sec: sigs=[0.050000, 0.000050, 0.050000]
Solver Options:
Number iterations: 10
MPI Decomposition Options:
Total MPI tasks: 2
Spatial decomp: 2 x 1 x 1 MPI tasks
Block solve method: Sweep
Per-Task Options:
DirSets/Directions: 8 sets, 12 directions/set
GroupSet/Groups: 2 sets, 512 groups/set
Zone Sets: 1 x 1 x 1
Architecture: OpenMP
Data Layout: DGZ
Generating Problem
==================
Decomposition Space:   Procs:      Subdomains (local/global):
---------------------  ----------  --------------------------
(P) Energy:            1           2 / 2
(Q) Direction:         1           8 / 8
(R) Space:             2           1 / 2
(Rx,Ry,Rz) R in XYZ:   2x1x1       1x1x1 / 2x1x1
(PQR) TOTAL:           2           16 / 32
Material Volumes=[8.789062e+03, 1.177734e+05, 2.753438e+06]
Memory breakdown of Field variables:
Field Variable        Num Elements  Megabytes
--------------        ------------  ---------
data/sigs                 15728640    120.000
dx                              16      0.000
dy                              16      0.000
dz                              16      0.000
ell                           2400      0.018
ell_plus                      2400      0.018
i_plane                   25165824    192.000
j_plane                   25165824    192.000
k_plane                   25165824    192.000
mixelem_to_fraction           4352      0.033
phi                      104857600    800.000
phi_out                  104857600    800.000
psi                      402653184   3072.000
quadrature/w                    96      0.001
quadrature/xcos                 96      0.001
quadrature/ycos                 96      0.001
quadrature/zcos                 96      0.001
rhs                      402653184   3072.000
sigt_zonal                 4194304     32.000
volume                        4096      0.031
--------              ------------  ---------
TOTAL                   1110455664   8472.104
Generation Complete!
Steady State Solve
==================
iter 0: particle count=1.197998e+09, change=1.000000e+00
iter 1: particle count=1.801368e+09, change=3.349511e-01
iter 2: particle count=2.102278e+09, change=1.431351e-01
iter 3: particle count=2.251810e+09, change=6.640521e-02
iter 4: particle count=2.325888e+09, change=3.184924e-02
iter 5: particle count=2.362467e+09, change=1.548355e-02
iter 6: particle count=2.380471e+09, change=7.563193e-03
iter 7: particle count=2.389305e+09, change=3.697158e-03
iter 8: particle count=2.393627e+09, change=1.805479e-03
iter 9: particle count=2.395735e+09, change=8.801810e-04
Solver terminated
Timers
======
Timer                    Count       Seconds
----------------  ------------  ------------
Generate                     1       0.01895
LPlusTimes                  10       2.95652
LTimes                      10       3.16498
Population                  10       0.41053
Scattering                  10      53.67290
Solve                        1      65.76153
Source                      10       0.00214
SweepSolver                 10       4.47819
SweepSubdomain             160       0.72801
TIMER_NAMES:Generate,LPlusTimes,LTimes,Population,Scattering,Solve,Source,SweepSolver,SweepSubdomain
TIMER_DATA:0.018945,2.956520,3.164980,0.410529,53.672897,65.761529,0.002142,4.478190,0.728007
Figures of Merit
================
Throughput: 6.122929e+07 [unknowns/(second/iteration)]
Grind time : 1.633205e-08 [(seconds/iteration)/unknowns]
Sweep efficiency : 16.25673 [100.0 * SweepSubdomain time / SweepSolver time]
Number of unknowns: 402653184
END
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292507)
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292512)
Info: 1/2 lprof instances finished
Your experiment path is /home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_5
To display your profiling results:
#################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_5 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_5 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_5 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_5 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_5 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_5 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_5 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_5 #
#################################################################################################################################################################################################
* Info: Selecting the 'perf-high-ppn' engine for node idp09.benchmarkcenter.megware.com
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292706)
* Warning: Found no event able to derive walltime: prepending ref-cycles
* Info: Process launched (host idp09.benchmarkcenter.megware.com, process 292711)
 _  __       _         _
| |/ /      (_)       | |
| ' /  _ __  _  _ __  | | __ ___
|  <  | '__|| || '_ \ | |/ // _ \
| . \ | |   | || |_) ||   <|  __/
|_|\_\|_|   |_|| .__/ |_|\_\\___|
               | |
               |_|        Version 1.2.4
LLNL-CODE-775068
Copyright (c) 2014-2019, Lawrence Livermore National Security, LLC
Kripke is released under the BSD 3-Clause License, please see the
LICENSE file for the full license
This work was produced under the auspices of the U.S. Department of
Energy by Lawrence Livermore National Laboratory under Contract
DE-AC52-07NA27344.
Author: Adam J. Kunen
Compilation Options:
Architecture: OpenMP
Compiler: /cluster/intel/oneapi/2024.0.0/mpi/2021.11/bin/mpiicpc
Compiler Flags: "-O3 -march=native -O2 -march=sapphirerapids -mprefer-vector-width=512 -funroll-loops -ffast-math -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -cxx=g++ -Wall -Wextra "
Linker Flags: " "
CHAI Enabled: No
CUDA Enabled: No
MPI Enabled: Yes
OpenMP Enabled: Yes
Caliper Enabled: No
OpenMP Thread->Core mapping for 48 threads on rank 0
0-> 0 1-> 1 2-> 2 3-> 3 4-> 4 5-> 5 6-> 6 7-> 7
8-> 8 9-> 9 10-> 10 11-> 11 12-> 12 13-> 13 14-> 14 15-> 15
16-> 16 17-> 17 18-> 18 19-> 19 20-> 20 21-> 21 22-> 22 23-> 23
24-> 24 25-> 25 26-> 26 27-> 27 28-> 28 29-> 29 30-> 30 31-> 31
32-> 32 33-> 33 34-> 34 35-> 35 36-> 36 37-> 37 38-> 38 39-> 39
40-> 40 41-> 41 42-> 42 43-> 43 44-> 44 45-> 45 46-> 46 47-> 47
Input Parameters
================
Problem Size:
Zones: 16 x 16 x 16 (4096 total)
Groups: 1024
Legendre Order: 4
Quadrature Set: Dummy S2 with 96 points
Physical Properties:
Total X-Sec: sigt=[0.100000, 0.000100, 0.100000]
Scattering X-Sec: sigs=[0.050000, 0.000050, 0.050000]
Solver Options:
Number iterations: 10
MPI Decomposition Options:
Total MPI tasks: 2
Spatial decomp: 2 x 1 x 1 MPI tasks
Block solve method: Sweep
Per-Task Options:
DirSets/Directions: 8 sets, 12 directions/set
GroupSet/Groups: 2 sets, 512 groups/set
Zone Sets: 1 x 1 x 1
Architecture: OpenMP
Data Layout: DGZ
Generating Problem
==================
Decomposition Space:   Procs:      Subdomains (local/global):
---------------------  ----------  --------------------------
(P) Energy:            1           2 / 2
(Q) Direction:         1           8 / 8
(R) Space:             2           1 / 2
(Rx,Ry,Rz) R in XYZ:   2x1x1       1x1x1 / 2x1x1
(PQR) TOTAL:           2           16 / 32
Material Volumes=[8.789062e+03, 1.177734e+05, 2.753438e+06]
Memory breakdown of Field variables:
Field Variable        Num Elements  Megabytes
--------------        ------------  ---------
data/sigs                 15728640    120.000
dx                              16      0.000
dy                              16      0.000
dz                              16      0.000
ell                           2400      0.018
ell_plus                      2400      0.018
i_plane                   25165824    192.000
j_plane                   25165824    192.000
k_plane                   25165824    192.000
mixelem_to_fraction           4352      0.033
phi                      104857600    800.000
phi_out                  104857600    800.000
psi                      402653184   3072.000
quadrature/w                    96      0.001
quadrature/xcos                 96      0.001
quadrature/ycos                 96      0.001
quadrature/zcos                 96      0.001
rhs                      402653184   3072.000
sigt_zonal                 4194304     32.000
volume                        4096      0.031
--------              ------------  ---------
TOTAL                   1110455664   8472.104
Generation Complete!
Steady State Solve
==================
iter 0: particle count=1.197998e+09, change=1.000000e+00
iter 1: particle count=1.801368e+09, change=3.349511e-01
iter 2: particle count=2.102278e+09, change=1.431351e-01
iter 3: particle count=2.251810e+09, change=6.640521e-02
iter 4: particle count=2.325888e+09, change=3.184924e-02
iter 5: particle count=2.362467e+09, change=1.548355e-02
iter 6: particle count=2.380471e+09, change=7.563193e-03
iter 7: particle count=2.389305e+09, change=3.697158e-03
iter 8: particle count=2.393627e+09, change=1.805479e-03
iter 9: particle count=2.395735e+09, change=8.801810e-04
Solver terminated
Timers
======
Timer                    Count       Seconds
----------------  ------------  ------------
Generate                     1       0.02000
LPlusTimes                  10       3.22138
LTimes                      10       3.43990
Population                  10       0.35467
Scattering                  10      51.05737
Solve                        1      65.49357
Source                      10       0.00215
SweepSolver                 10       6.31110
SweepSubdomain             160       1.32933
TIMER_NAMES:Generate,LPlusTimes,LTimes,Population,Scattering,Solve,Source,SweepSolver,SweepSubdomain
TIMER_DATA:0.019997,3.221384,3.439901,0.354669,51.057366,65.493571,0.002153,6.311101,1.329326
Figures of Merit
================
Throughput: 6.147980e+07 [unknowns/(second/iteration)]
Grind time : 1.626550e-08 [(seconds/iteration)/unknowns]
Sweep efficiency : 21.06329 [100.0 * SweepSubdomain time / SweepSolver time]
Number of unknowns: 402653184
END
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292706)
* Info: Process finished (host idp09.benchmarkcenter.megware.com, process 292711)
Info: 1/2 lprof instances finished
Your experiment path is /home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_6
To display your profiling results:
#################################################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
#################################################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_6 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_6 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_6 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_6 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_6 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_6 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_6 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/eoseret/qaas_runs_CPU_9468/Kripke_HBM/intel/Kripke/run/oneview_runs/compilers/gcc_6/oneview_results_scal/tools/lprof_npsu_run_6 #
#################################################################################################################################################################################################