
Executable Output


* [MAQAO] Info: Detected 1 Lprof instances in krakenngpu08.cluster. 
If this is incorrect, rerun with number-processes-per-node=X
      __      ______  _____   __      ________  _
     /     / /  _ |  __        / /____  || |
    /     / /| |_) | |__) |     / /    / /_| | _____   __
   / /  / / |  _ <|  ___/     / /    / / _` |/ _   / /
  / ____   /  | |_) | |           /    / / (_| |  __/ V /
 /_/    _/   |____/|_|          /    /_/ __,_|___| _/  


Using branch  :
Version date  : Mon, 18 Nov 2024 11:40:50 +0100
Commit        : b58af1ea20

MPI processes : 1

Computation #1/1

Compilation info :  mpif90 -g -Mpreprocess -O3 -fastsse -Munroll -byteswapio -tp=px -acc=gpu -Minfo=accel -Minline -I/softs/local_pgi/phdf5/1.8.20_pgi204_zen/include -DHAS_PMETIS -I/softs/local_pgi/parmetis/403_r64_pgi201_px/include 

Compilation wrapper info : nvfortran -I/home/logiciels/nvidia/hpc_sdk/Linux_x86_64/24.1/comm_libs/12.3/hpcx/hpcx-2.17.1/ompi/include -I/home/logiciels/nvidia/hpc_sdk/Linux_x86_64/24.1/comm_libs/12.3/hpcx/hpcx-2.17.1/ompi/lib -L/home/logiciels/nvidia/hpc_sdk/Linux_x86_64/24.1/comm_libs/12.3/hpcx/hpcx-2.17.1/ompi/lib -rpath /home/logiciels/nvidia/hpc_sdk/Linux_x86_64/24.1/comm_libs/12.3/hpcx/hpcx-2.17.1/ompi/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi

Compilation user :  neto 

Compilation date :  2025-03-06 13:53:10 

Compilation MPI version : mpirun (Open MPI) 4.1.7a1

AVBP version : 7.15.0

Reading input file version : 7.15.0

 ----> Reading run parameters : .//run.params
 ----> Using NATURAL reordering

 ----> command.dat file API is enabled

 ----> GPU DIRECT ENABLED

 ----> Using TTGC
       with UNCLOSED boundary terms

 ----> Using colin viscosity model
 
 >>>>> WARNING
 >>>>> No solution storage required: additional variables are deactivated!
 


 
 >>>>> WARNING
 >>>>> Temporals are not computed!
 
 ----> Reading mesh : .//../Mesh/Bench_simple.mesh.h5

       Meshfile signature: e736f2c3c25f98227815b524d9568f9e


 ----> Initialize the solution writers (1 writers)

 
 >>>>> WARNING
 >>>>> No instantaneous solution storage required: the calculation of additional variables is deactivated.
 
       Checking TFLES table parameters...
       Flame thickening is applied with fthick =  17.00
 ----> Reading boundary conditions in asciibound file : .//../Mesh/Bench_simple.asciiBound.key
        _______________________________________________________________________________________
       | Boundary patches (no reordering)                                                     |
       |______________________________________________________________________________________|
       | Patch number   Patch name                         Boundary condition                 |
       | ------------   ----------                         ------------------                 |
       | 1              INLET                              INLET_RELAX_UVW_T_Y                |
       | 2              OUTLET                             OUTLET_RELAX_P                     |
       | 3              WALL                               WALL_NOSLIP_ADIAB                  |
       |______________________________________________________________________________________|
        ______________________________________________________________
       | Info on initial grid                                        |
       |_____________________________________________________________|
       | number of dimensions              : 3                       |
       | number of nodes                   : 514475                  |
       | number of cells                   : 2958592                 |
       | - tetrahedra                      : 2958592                 |
       | number of cell per group          : 500000                  |
       | number of boundary nodes          : 48048                   |
       | number of periodic nodes          : 0                       |
       | number of axi-periodic nodes      : 0                       |
       |_____________________________________________________________|
       | After partitioning                                          |
       |_____________________________________________________________|
       | number of nodes                   : 514475                  |
       | extra nodes due to partitioning   : 0 [+   0.00‰]           |
       |_____________________________________________________________|
        ______________________________________________________________
       | Partitioning Quality                                        |
       |_____________________________________________________________|
       | Maximum number of neighbors       :             0.00        |
       | Average number of neighbors       :             0.00        |
       | Maximum number of exchange nodes  :             0.00        |
       | Average number of exchange nodes  :             0.00        |
       |_____________________________________________________________|


 ----> Reading initial solution : .//../Mesh/Bench_simple.h5
 ----> Reading took   0.157s
        ______________________________________________________________
       | Info on chemistry                                           |
       |_____________________________________________________________|
       | Kinetic scheme : CH4-AIR-2S-CM2_FLAMMABLE                   |
       | Validity range : 300K/1bar                                  |
       |                                                             |
       | Chemical reaction #1                                        |
       | fthick                       :  1.70000000E+01              |
       | Preexponential / fthick [SI] :  2.00000000E+09              |
       | Activation temperature [K]   :  1.76130282E+04              |
       |                                                             |
       | Chemical reaction #2                                        |
       | fthick                       :  1.70000000E+01              |
       | Preexponential / fthick [SI] :  2.00000000E+06              |
       | Activation temperature [K]   :  6.03875251E+03              |
       |_____________________________________________________________|
        ______________________________________________________________
       | Info on initial solution                                    |
       |_____________________________________________________________|
       | number of Navier-Stokes equations :  5                      |
       | number of species                 :  6                      |
       | number of reactions               :  2                      |
       | number of tpf equations           :  0                      |
       | number of fictive species         :  0                      |
       | initial iteration                 :  0                      |
       | initial time                      :  0.00000000E+00         |
       |_____________________________________________________________|


 ----> Reading solutbound : .//../Mesh/Bench_simple.solutBound.h5
     - Using 6.X format

 ----> Reading took   0.010s

 ----> Initialising metrics

 ----> Total volume of the mesh [m3] :  1.44077204E-01
 ----> Smallest cell volume [m3]     :  2.97811897E-11
 ----> Found cached wall distance computation. Checking: ./ywall.h5
     > Signatures match

 ----> Reading cached wall distance computation: ./ywall.h5
 ----> Reading took   0.003s

 ----> Boundary MPIs: 1


 ----> End pre-processing.


 ________________________________________________________________________________________________________

 ***** GPU memory (used/total):   7497 MB /  24051 MB | Cell per group: 500000



 ----> Starts the temporal loop.


 ***** GPU memory (used/total):   7651 MB /  24051 MB | Cell per group: 500000



 ----> End computation.


 ________________________________________________________________________________________________________
        ____________________________________________________________________________________________
       | 1 MPI tasks with GPU     Elapsed real time [s]       [s.cores]      [h.cores]             |
       |___________________________________________________________________________________________|
       | AVBP                   :       111.84               1.1184E+02     3.1066E-02             |
       | Temporal loop          :        90.80               9.0803E+01     2.5223E-02             |
       | Per simulated second   :   1.8374E+06               1.8374E+06     5.1039E+02             |
       | Per iteration          :       0.4540               4.5401E-01                            |
       |-------------------------------------------------------------------------------------------|
       | RCT  [s.mpi/node/it]   :   8.82479782E-07                                                 |
       |___________________________________________________________________________________________|

 ----> Initial physical time   :  0.00000000E+00
       Initial iteration       :  0
       Initial timestep        :  2.46481884E-07

 ----> Final physical time     :  4.94188563E-05
       Final iteration         :  200
       Final timestep          :  2.47527782E-07

 ----> Simulated physical time :  4.94188563E-05
       Simulated iterations    :  200
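
The derived figures in the summary table above can be reproduced from the raw values printed in this log. The sketch below is a cross-check, not AVBP's own code: the variable names are illustrative, and the RCT formula (temporal loop time divided by iterations and mesh nodes, per MPI task) is an assumption inferred from the unit label and the printed numbers.

```python
# Values copied from the log above (rounded as printed).
total_s = 111.84                    # "AVBP" elapsed real time [s]
temporal_loop_s = 90.803            # "Temporal loop" elapsed real time [s]
iterations = 200                    # "Simulated iterations"
nodes = 514475                      # "number of nodes" of the mesh
simulated_time_s = 4.94188563e-05   # "Simulated physical time"
mpi_tasks = 1                       # "MPI processes"

# Wall-clock cost of one iteration of the temporal loop.
per_iteration_s = temporal_loop_s / iterations

# Wall-clock seconds needed to advance one second of physical time.
per_simulated_second = temporal_loop_s / simulated_time_s

# Total cost in core-hours (the "[h.cores]" column of the AVBP row).
core_hours = total_s * mpi_tasks / 3600.0

# ASSUMPTION: RCT is the time per iteration and per mesh node, scaled by
# the number of MPI tasks. The log does not state the formula.
rct = temporal_loop_s * mpi_tasks / (iterations * nodes)

print(f"per iteration        : {per_iteration_s:.4f} s")
print(f"per simulated second : {per_simulated_second:.4e} s")
print(f"core hours           : {core_hours:.4e}")
print(f"RCT                  : {rct:.4e} s.mpi/node/it")
```

Because the inputs are the rounded values printed in the log, the results agree with the table only to about four significant digits.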

 ________________________________________________________________________________________________________

       TIMERS
 ________________________________________________________________________________________________________

       Prints relevant timers and breaks down percentage regarding reference timers.

       > The 'Total slave simulation' time corresponds to the 1st level, and is measured by slave_timer (sum of pre temporal loop, temporal loop and post temporal loop).
       > The 'Computation' time corresponds to the time integration loops, and is measured by rungekutta_timer.
       > Levels are depicted using [X.Y.Z. ...] lists. The number of entry in the list corresponds to the level.
       > References to the upper level is made to compute the contribution of one sub-level to its parent level.
       > The times displayed are those of the master processor.
       > For each timer, the minimum, maximum and mean values for all processors are also shown in the 3 right-hand columns.
       > A json file 'timers.json' containing all the data is also available in the temporal output directory.
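
As a minimal illustration of the three percentage columns described above, the sketch below recomputes them for one timer, `[0.2.1.10.1] Chemical source terms calculation`, from the times printed in the tables that follow. The helper `share` is hypothetical and is not part of AVBP. The layout of `timers.json` is not shown in this log, so the values are entered by hand rather than read from that file.

```python
def share(part_s: float, whole_s: float) -> float:
    """Return part_s as a percentage of whole_s (0.0 if the whole is zero)."""
    return 100.0 * part_s / whole_s if whole_s else 0.0

# Times in seconds, copied from the timer tables in this log.
total_slave_s = 111.84        # [0] Total slave simulation
computation_s = 90.706        # [0.2.1] Computation
combustion_s = 2.9503         # [0.2.1.10] Combustion (parent timer)
chem_source_s = 1.8450        # [0.2.1.10.1] Chemical source terms calculation

rel_total = share(chem_source_s, total_slave_s)   # "relative to tot. slave"
rel_comp = share(chem_source_s, computation_s)    # "relative to computation"
rel_upper = share(chem_source_s, combustion_s)    # "relative to upper level"

print(f"tot. slave : {rel_total:.2f}%")
print(f"computation: {rel_comp:.2f}%")
print(f"upper level: {rel_upper:.2f}%")
```

The same three divisions apply to any timer: the denominators are always the level-1 total, the `Computation` timer, and the immediate parent in the `[X.Y.Z. ...]` list.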


       ----- 1st level timers
                                                                                  time [s]   |  relative to                                     [   min [s]      mean [s]      max [s]   ]
                                                                                             | tot. slave [%]                                   [                                        ]
        >  [0] Total slave simulation :                                          1.1184E+02  |      100.00%                                     [  1.1184E+02   1.1184E+02   1.1184E+02  ]


       ----- 2nd level timers
                                                                                  time [s]   |  relative to                                     [   min [s]      mean [s]      max [s]   ]
                                                                                             | tot. slave [%]                                   [                                        ]
        > >  [0.1] Pre temporal loop :                                           2.1027E+01  |       18.80%                                     [  2.1027E+01   2.1027E+01   2.1027E+01  ]
        > >  [0.2] Temporal loop :                                               9.0803E+01  |       81.19%                                     [  9.0803E+01   9.0803E+01   9.0803E+01  ]
        > >  [0.2a] Temporal loop without IO :                                   9.0803E+01  |       81.19%                                     [  9.0803E+01   9.0803E+01   9.0803E+01  ]
        > >  [0.3] Post temporal loop :                                          8.2618E-03  |        0.01%                                     [  8.2618E-03   8.2618E-03   8.2618E-03  ]
        > >  [0.4] Point to Point communications :                               2.6444E-01  |        0.24%                                     [  2.6444E-01   2.6444E-01   2.6444E-01  ]


       ----- 3rd level timers
                                                                                  time [s]   |  relative to   |  relative to                    [   min [s]      mean [s]      max [s]   ]
                                                                                             | tot. slave [%] | upper level [%]                 [                                        ]
        > >  [0.1] Pre temporal loop :                                           2.1027E+01  |       18.80%                                     [  2.1027E+01   2.1027E+01   2.1027E+01  ]
        > > >  [0.1.1] Build online postprocessing objects :                     0.0000E+00  |        0.00%   |        0.00%                    [  0.0000E+00   0.0000E+00   0.0000E+00  ]
        > >  [0.2] Temporal loop :                                               9.0803E+01  |       81.19%                                     [  9.0803E+01   9.0803E+01   9.0803E+01  ]
        > > >  [0.2.1] Computation :                                             9.0706E+01  |       81.10%   |       99.89%                    [  9.0706E+01   9.0706E+01   9.0706E+01  ]
        > > >  [0.2.2] Temporal post-processing :                                0.0000E+00  |        0.00%   |        0.00%                    [  0.0000E+00   0.0000E+00   0.0000E+00  ]
        > > >  [0.2.3] Instantaneous solution post-processing :                  0.0000E+00  |        0.00%   |        0.00%                    [  0.0000E+00   0.0000E+00   0.0000E+00  ]
        > > >  [0.2.4] Average solution post-processing :                        0.0000E+00  |        0.00%   |        0.00%                    [  0.0000E+00   0.0000E+00   0.0000E+00  ]
        > > >  [0.2.5] Online post-processing compute and storage :              0.0000E+00  |        0.00%   |        0.00%                    [  0.0000E+00   0.0000E+00   0.0000E+00  ]


       ----- 4th level timers: focus on Computation level (rungekutta_timer)
                                                                                  time [s]   |  relative to   |  relative to   |  relative to   [   min [s]      mean [s]      max [s]   ]
                                                                                             | tot. slave [%] | computation [%]| upper level [%][                                        ]
        > > >  [0.2.1] Computation :                                             9.0706E+01  |       81.10%   |      100.00%                    [  9.0706E+01   9.0706E+01   9.0706E+01  ]
        > > > >  [0.2.1.1] Convective scheme :                                   2.1806E+01  |       19.50%   |       24.04%   |       24.04%   [  2.1806E+01   2.1806E+01   2.1806E+01  ]
        > > > >  [0.2.1.2] Diffusion operator :                                  2.6680E+01  |       23.86%   |       29.41%   |       29.41%   [  2.6680E+01   2.6680E+01   2.6680E+01  ]
        > > > >  [0.2.1.4] Time-step calculation :                               1.5597E+00  |        1.39%   |        1.72%   |        1.72%   [  1.5597E+00   1.5597E+00   1.5597E+00  ]
        > > > >  [0.2.1.5] Transport calculation :                               2.7525E-01  |        0.25%   |        0.30%   |        0.30%   [  2.7525E-01   2.7525E-01   2.7525E-01  ]
        > > > >  [0.2.1.6] Thermo calculation :                                  4.3679E-01  |        0.39%   |        0.48%   |        0.48%   [  4.3679E-01   4.3679E-01   4.3679E-01  ]
        > > > >  [0.2.1.7] Gradient calculation :                                8.6364E+00  |        7.72%   |        9.52%   |        9.52%   [  8.6364E+00   8.6364E+00   8.6364E+00  ]
        > > > >  [0.2.1.8] Boundary :                                            1.3180E+00  |        1.18%   |        1.45%   |        1.45%   [  1.3180E+00   1.3180E+00   1.3180E+00  ]
        > > > >  [0.2.1.9] Turbulent viscosity model :                           1.4860E+00  |        1.33%   |        1.64%   |        1.64%   [  1.4860E+00   1.4860E+00   1.4860E+00  ]
        > > > >  [0.2.1.10] Combustion (source term + TFLES + efcy + efcy I0) :  2.9503E+00  |        2.64%   |        3.25%   |        3.25%   [  2.9503E+00   2.9503E+00   2.9503E+00  ]
        > > > > >  [0.2.1.10.1] Chemical source terms calculation :              1.8450E+00  |        1.65%   |        2.03%   |       62.54%   [  1.8450E+00   1.8450E+00   1.8450E+00  ]
        > > > > >  [0.2.1.10.2] TFLES model calculation :                        0.0000E+00  |        0.00%   |        0.00%   |        0.00%   [  0.0000E+00   0.0000E+00   0.0000E+00  ]
        > > > > >  [0.2.1.10.3] Efficiency function calculation :                1.1052E+00  |        0.99%   |        1.22%   |       37.46%   [  1.1052E+00   1.1052E+00   1.1052E+00  ]
        > > > > >  [0.2.1.10.4] Efficiency I0 function calculation :             0.0000E+00  |        0.00%   |        0.00%   |        0.00%   [  0.0000E+00   0.0000E+00   0.0000E+00  ]
        > > > >  [0.2.1.11] Artificial viscosity :                               9.6428E+00  |        8.62%   |       10.63%   |       10.63%   [  9.6428E+00   9.6428E+00   9.6428E+00  ]
        > > > >  [0.2.1.17] Source terms :                                       0.0000E+00  |        0.00%   |        0.00%   |        0.00%   [  0.0000E+00   0.0000E+00   0.0000E+00  ]
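
The direct children of `[0.2.1] Computation` listed above do not add up to the parent timer. The sketch below sums them to show how much of the computation time the 4th-level timers cover. The dictionary keys are shortened labels chosen for illustration, and the log does not say what the uncovered remainder contains, so no attribution is attempted here.

```python
# Direct children of [0.2.1] Computation, times in seconds from the table above.
# Sub-timers of Combustion ([0.2.1.10.x]) are excluded to avoid double counting.
children_s = {
    "convective_scheme": 21.806,
    "diffusion_operator": 26.680,
    "time_step": 1.5597,
    "transport": 0.27525,
    "thermo": 0.43679,
    "gradient": 8.6364,
    "boundary": 1.3180,
    "turbulent_viscosity": 1.4860,
    "combustion": 2.9503,
    "artificial_viscosity": 9.6428,
    "source_terms": 0.0,
}
computation_s = 90.706  # [0.2.1] Computation

covered_s = sum(children_s.values())
covered_pct = 100.0 * covered_s / computation_s
uncovered_s = computation_s - covered_s

print(f"covered by listed timers: {covered_s:.2f} s ({covered_pct:.2f}%)")
print(f"not covered             : {uncovered_s:.2f} s")
```

With the values printed in this run, the listed timers account for roughly 82% of the computation time.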


 ----> End of AVBP session

 ----> Found 3 warning messages for this computation, check your output file!


 
 ***** Memory usage (system):  Max: 3526.609 MB (rank:0)  Min: 3526.609 MB (rank:0)  Ave: 3526.609 MB  Std: 0.000 MB 
 

 ***** Maximum memory (mod_alloc) : 2145712836 B ( 2.046311E+03 MB)



Your experiment path is /home/exter/neto/SIMPLE/Run/maqao_2025-06-12_17-33-43/tools/lprof_npsu_run_0

To display your profiling results:
########################################################################################################################################
#    LEVEL    |     REPORT     |                                                COMMAND                                                #
########################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/exter/neto/SIMPLE/Run/maqao_2025-06-12_17-33-43/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/exter/neto/SIMPLE/Run/maqao_2025-06-12_17-33-43/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/exter/neto/SIMPLE/Run/maqao_2025-06-12_17-33-43/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/exter/neto/SIMPLE/Run/maqao_2025-06-12_17-33-43/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/exter/neto/SIMPLE/Run/maqao_2025-06-12_17-33-43/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/exter/neto/SIMPLE/Run/maqao_2025-06-12_17-33-43/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/exter/neto/SIMPLE/Run/maqao_2025-06-12_17-33-43/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/exter/neto/SIMPLE/Run/maqao_2025-06-12_17-33-43/tools/lprof_npsu_run_0  #
########################################################################################################################################
