OV - AVBP_V7_dev.KRAKEN_A100

Executable Output


* [MAQAO] Info: Detected 1 Lprof instances in krakenngpu01.cluster. 
If this is incorrect, rerun with number-processes-per-node=X
      __      ______  _____   __      ________  _
     /     / /  _ |  __        / /____  || |
    /     / /| |_) | |__) |     / /    / /_| | _____   __
   / /  / / |  _ <|  ___/     / /    / / _` |/ _   / /
  / ____   /  | |_) | |           /    / / (_| |  __/ V /
 /_/    _/   |____/|_|          /    /_/ __,_|___| _/  


Using branch  :
Version date  : Mon, 18 Nov 2024 11:40:50 +0100
Commit        : b58af1ea20

MPI processes : 1

Computation #1/1

Compilation info :  mpif90 -g -Mpreprocess -O3 -fastsse -Munroll -byteswapio -tp=px -acc=gpu -Minfo=accel -Minline -I/softs/local_pgi/phdf5/1.8.20_pgi204_zen/include -DHAS_PMETIS -I/softs/local_pgi/parmetis/403_r64_pgi201_px/include 

Compilation wrapper info : nvfortran -I/home/logiciels/nvidia/hpc_sdk/Linux_x86_64/24.1/comm_libs/12.3/hpcx/hpcx-2.17.1/ompi/include -I/home/logiciels/nvidia/hpc_sdk/Linux_x86_64/24.1/comm_libs/12.3/hpcx/hpcx-2.17.1/ompi/lib -L/home/logiciels/nvidia/hpc_sdk/Linux_x86_64/24.1/comm_libs/12.3/hpcx/hpcx-2.17.1/ompi/lib -rpath /home/logiciels/nvidia/hpc_sdk/Linux_x86_64/24.1/comm_libs/12.3/hpcx/hpcx-2.17.1/ompi/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi

Compilation user :  neto 

Compilation date :  2025-03-06 13:53:10 

Compilation MPI version : mpirun (Open MPI) 4.1.7a1

AVBP version : 7.15.0

Reading input file version : 7.15.0

 ----> Reading run parameters : .//run.params
 ----> Using NATURAL reordering

 
 >>>>> WARNING
 >>>>>  Use of cached el2part disabled

 
 ----> command.dat file API is enabled

 ----> GPU DIRECT ENABLED

 ----> Using TTGC
       with UNCLOSED boundary terms

 ----> Using colin_species viscosity model


 
 >>>>> WARNING
 >>>>> Temporals are not computed!
 
 
 >>>>> WARNING
 >>>>> Specifier 'transport = computed' is deprecated since version 7.14. Please use 'transport = simplified' instead (simplified transport model).
 
 ----> Reading mesh : .//../MESH/config5_uniform_true.mesh.h5

       Meshfile signature: fd2a16d9e6abb6308ccb0c380607da34


 ----> Initialize the solution writers (1 writers)

 
 >>>>> WARNING
 >>>>> No instantaneous solution storage required: the calculation of additional variables is deactivated.
 
       Checking TFLES table parameters...
       Local optimal thickening is applied with   5.00 cells in the flame front.
 ----> Reading boundary conditions in asciibound file : .//../MESH/config5_uniform_true.asciiBound.key
        _______________________________________________________________________________________
       | Boundary patches (no reordering)                                                     |
       |______________________________________________________________________________________|
       | Patch number   Patch name                         Boundary condition                 |
       | ------------   ----------                         ------------------                 |
       | 1              plenum_back                        OUTLET_RELAX_P_3D                  |
       | 2              plenum_left                        OUTLET_RELAX_P_3D                  |
       | 3              plenum_top                         OUTLET_RELAX_P_3D                  |
       | 4              plenum_right                       OUTLET_RELAX_P_3D                  |
       | 5              plenum_bottom                      OUTLET_RELAX_P_3D                  |
       | 6              plenum_fr                          WALL_NOSLIP_ADIAB                  |
       | 7              chamber_bottom                     WALL_NOSLIP_ADIAB                  |
       | 8              chamber_left                       WALL_NOSLIP_ADIAB                  |
       | 9              chamber_top                        WALL_NOSLIP_ADIAB                  |
       | 10             chamber_right                      WALL_NOSLIP_ADIAB                  |
       | 11             chamber_fr                         WALL_NOSLIP_ADIAB                  |
       | 12             grid11_fr                          WALL_NOSLIP_ADIAB                  |
       | 13             grid11_ba                          WALL_NOSLIP_ADIAB                  |
       | 14             grid12_fr                          WALL_NOSLIP_ADIAB                  |
       | 15             grid12_ba                          WALL_NOSLIP_ADIAB                  |
       | 16             grid13_fr                          WALL_NOSLIP_ADIAB                  |
       | 17             grid13_ba                          WALL_NOSLIP_ADIAB                  |
       | 18             grid14_fr                          WALL_NOSLIP_ADIAB                  |
       | 19             grid14_ba                          WALL_NOSLIP_ADIAB                  |
       | 20             grid15_fr                          WALL_NOSLIP_ADIAB                  |
       | 21             grid15_ba                          WALL_NOSLIP_ADIAB                  |
       | 22             obtsacle_fr                        WALL_NOSLIP_ADIAB                  |
       | 23             obtsacle_ba                        WALL_NOSLIP_ADIAB                  |
       |______________________________________________________________________________________|
        ______________________________________________________________
       | Info on initial grid                                        |
       |_____________________________________________________________|
       | number of dimensions              : 3                       |
       | number of nodes                   : 3543424                 |
       | number of cells                   : 20083754                |
       | - tetrahedra                      : 20083754                |
       | number of cell per group          : 1000000                 |
       | number of boundary nodes          : 243230                  |
       | number of periodic nodes          : 0                       |
       | number of axi-periodic nodes      : 0                       |
       |_____________________________________________________________|
       | After partitioning                                          |
       |_____________________________________________________________|
       | number of nodes                   : 3543424                 |
       | extra nodes due to partitioning   : 0 [+   0.00‰]           |
       |_____________________________________________________________|
        ______________________________________________________________
       | Partitioning Quality                                        |
       |_____________________________________________________________|
       | Maximum number of neighbors       :             0.00        |
       | Average number of neighbors       :             0.00        |
       | Maximum number of exchange nodes  :             0.00        |
       | Average number of exchange nodes  :             0.00        |
       |_____________________________________________________________|


 ----> Reading initial solution : .//../MESH/init.h5
 ----> Reading took   1.687s
        ______________________________________________________________
       | Info on chemistry                                           |
       |_____________________________________________________________|
       | Kinetic scheme : C3H8_F2                                    |
       |                                                             |
       | Chemical reaction #1                                        |
       | Preexponential / fthick [SI] :  2.38433847E+09              |
       | Activation temperature [K]   :  2.08840191E+04              |
       |                                                             |
       | Chemical reaction #2                                        |
       | Preexponential / fthick [SI] :  4.50000000E+07              |
       | Activation temperature [K]   :  1.00645875E+04              |
       |_____________________________________________________________|
        ______________________________________________________________
       | Info on initial solution                                    |
       |_____________________________________________________________|
       | number of Navier-Stokes equations :  5                      |
       | number of species                 :  6                      |
       | number of reactions               :  2                      |
       | number of tpf equations           :  0                      |
       | number of fictive species         :  0                      |
       | initial iteration                 :  93387                  |
       | initial time                      :  9.00005232E-03         |
       |_____________________________________________________________|


 ----> Reading solutbound : .//../MESH/perso.solutBound.h5
     - Using 6.X format

 ----> Reading took   0.010s

 ----> Initialising metrics

 ----> Total volume of the mesh [m3] :  1.35006148E+01
 ----> Smallest cell volume [m3]     :  1.48598105E-12
 ----> Found cached wall distance computation. Checking: ./ywall.h5
     > Signatures match

 ----> Reading cached wall distance computation: ./ywall.h5
 ----> Reading took   0.172s

 ----> Boundary MPIs: 1


 ----> End pre-processing.


 ________________________________________________________________________________________________________

 ***** GPU memory (used/total):  22707 MB /  24051 MB | Cell per group: 1000000



 ----> Starts the temporal loop.


 ***** GPU memory (used/total):  23004 MB /  24051 MB | Cell per group: 1000000



 ----> End computation.


 ________________________________________________________________________________________________________
        ____________________________________________________________________________________________
       | 1 MPI tasks with GPU     Elapsed real time [s]       [s.cores]      [h.cores]             |
       |___________________________________________________________________________________________|
       | AVBP                   :      1195.84               1.1958E+03     3.3218E-01             |
       | Temporal loop          :       309.07               3.0907E+02     8.5854E-02             |
       | Per simulated second   :   3.4250E+07               3.4250E+07     9.5139E+03             |
       | Per iteration          :       3.0907               3.0907E+00                            |
       |-------------------------------------------------------------------------------------------|
       | RCT  [s.mpi/node/it]   :   8.72244536E-07                                                 |
       |___________________________________________________________________________________________|
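
Note: the derived rows of the table above can be recomputed from the temporal-loop time, the iteration count, the simulated physical time and the mesh node count reported elsewhere in this log. The sketch below does that arithmetic. The RCT formula (time x MPI tasks / mesh nodes / iterations) is an assumption inferred from the unit label [s.mpi/node/it], not taken from AVBP documentation. Small differences in the last digits are expected because the inputs are the rounded values printed in the log.

```python
# Values copied from this log.
temporal_loop_s = 309.07            # "Temporal loop", elapsed real time [s]
n_mpi_tasks = 1                     # "MPI processes"
n_iterations = 100                  # "Simulated iterations"
n_mesh_nodes = 3_543_424            # "number of nodes"
simulated_time_s = 9.02399946e-06   # "Simulated physical time"

# Wall-clock cost of one iteration (log reports 3.0907).
per_iteration_s = temporal_loop_s / n_iterations

# Wall-clock cost of one second of physical time (log reports 3.4250E+07).
per_simulated_second_s = temporal_loop_s / simulated_time_s

# Core-hours spent in the temporal loop (log reports 8.5854E-02).
core_hours = temporal_loop_s * n_mpi_tasks / 3600.0

# Assumed definition of RCT, inferred from its unit [s.mpi/node/it]
# (log reports 8.72244536E-07).
rct = temporal_loop_s * n_mpi_tasks / (n_mesh_nodes * n_iterations)

print(f"Per iteration        : {per_iteration_s:.4f} s")
print(f"Per simulated second : {per_simulated_second_s:.4E} s")
print(f"Temporal loop        : {core_hours:.4E} h.cores")
print(f"RCT                  : {rct:.6E} s.mpi/node/it")
```

Comparing RCT between runs is only meaningful under this assumed definition if the mesh and the number of MPI tasks are reported the same way in both logs.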

 ----> Initial physical time   :  9.00005232E-03
       Initial iteration       :  93387
       Initial timestep        :  9.02395769E-08

 ----> Final physical time     :  9.00907632E-03
       Final iteration         :  93487
       Final timestep          :  9.02440465E-08

 ----> Simulated physical time :  9.02399946E-06
       Simulated iterations    :  100

 ________________________________________________________________________________________________________

       TIMERS
 ________________________________________________________________________________________________________

       Prints relevant timers and breaks down percentage regarding reference timers.

       > The 'Total slave simulation' time corresponds to the 1st level, and is measured by slave_timer (sum of pre temporal loop, temporal loop and post temporal loop).
       > The 'Computation' time corresponds to the time integration loops, and is measured by rungekutta_timer.
       > Levels are depicted using [X.Y.Z. ...] lists. The number of entries in the list corresponds to the level.
       > Reference to the upper level is made to compute the contribution of one sub-level to its parent level.
       > The times displayed are those of the master processor.
       > For each timer, the minimum, maximum and mean values for all processors are also shown in the 3 right-hand columns.
       > A json file 'timers.json' containing all the data is also available in the temporal output directory.


       ----- 1st level timers
                                                                                  time [s]   |  relative to                                     [   min [s]      mean [s]      max [s]   ]
                                                                                             | tot. slave [%]                                   [                                        ]
        >  [0] Total slave simulation :                                          1.1958E+03  |      100.00%                                     [  1.1958E+03   1.1958E+03   1.1958E+03  ]


       ----- 2nd level timers
                                                                                  time [s]   |  relative to                                     [   min [s]      mean [s]      max [s]   ]
                                                                                             | tot. slave [%]                                   [                                        ]
        > >  [0.1] Pre temporal loop :                                           8.8673E+02  |       74.15%                                     [  8.8673E+02   8.8673E+02   8.8673E+02  ]
        > >  [0.2] Temporal loop :                                               3.0907E+02  |       25.85%                                     [  3.0907E+02   3.0907E+02   3.0907E+02  ]
        > >  [0.2a] Temporal loop without IO :                                   3.0907E+02  |       25.85%                                     [  3.0907E+02   3.0907E+02   3.0907E+02  ]
        > >  [0.3] Post temporal loop :                                          3.7884E-02  |        0.00%                                     [  3.7884E-02   3.7884E-02   3.7884E-02  ]
        > >  [0.4] Point to Point communications :                               1.4293E-01  |        0.01%                                     [  1.4293E-01   1.4293E-01   1.4293E-01  ]
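
Note: the TIMERS header states that 'Total slave simulation' is the sum of the pre temporal loop, temporal loop and post temporal loop. The sketch below checks this against the 2nd-level values and recomputes the "relative to tot. slave" column. Timer [0.4] is left out of the sum because the header does not list it as a component; how it overlaps the other timers is not stated in the log.

```python
# 2nd-level timers copied from this log [s].
total_slave_s = 1.1958e03
components_s = {
    "Pre temporal loop": 8.8673e02,
    "Temporal loop": 3.0907e02,
    "Post temporal loop": 3.7884e-02,
}

# Sum of the three components named in the TIMERS header.
components_sum_s = sum(components_s.values())

# Share of each component relative to the total, in percent.
shares_pct = {name: 100.0 * t / total_slave_s for name, t in components_s.items()}

print(f"Sum of components : {components_sum_s:.4E} s (total: {total_slave_s:.4E} s)")
for name, pct in shares_pct.items():
    print(f"{name:<20s}: {pct:6.2f} %")
```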


       ----- 3rd level timers
                                                                                  time [s]   |  relative to   |  relative to                    [   min [s]      mean [s]      max [s]   ]
                                                                                             | tot. slave [%] | upper level [%]                 [                                        ]
        > >  [0.1] Pre temporal loop :                                           8.8673E+02  |       74.15%                                     [  8.8673E+02   8.8673E+02   8.8673E+02  ]
        > > >  [0.1.1] Build online postprocessing objects :                     0.0000E+00  |        0.00%   |        0.00%                    [  0.0000E+00   0.0000E+00   0.0000E+00  ]
        > >  [0.2] Temporal loop :                                               3.0907E+02  |       25.85%                                     [  3.0907E+02   3.0907E+02   3.0907E+02  ]
        > > >  [0.2.1] Computation :                                             3.0841E+02  |       25.79%   |       99.79%                    [  3.0841E+02   3.0841E+02   3.0841E+02  ]
        > > >  [0.2.2] Temporal post-processing :                                0.0000E+00  |        0.00%   |        0.00%                    [  0.0000E+00   0.0000E+00   0.0000E+00  ]
        > > >  [0.2.3] Instantaneous solution post-processing :                  0.0000E+00  |        0.00%   |        0.00%                    [  0.0000E+00   0.0000E+00   0.0000E+00  ]
        > > >  [0.2.4] Average solution post-processing :                        0.0000E+00  |        0.00%   |        0.00%                    [  0.0000E+00   0.0000E+00   0.0000E+00  ]
        > > >  [0.2.5] Online post-processing compute and storage :              0.0000E+00  |        0.00%   |        0.00%                    [  0.0000E+00   0.0000E+00   0.0000E+00  ]


       ----- 4th level timers: focus on Computation level (rungekutta_timer)
                                                                                  time [s]   |  relative to   |  relative to   |  relative to   [   min [s]      mean [s]      max [s]   ]
                                                                                             | tot. slave [%] | computation [%]| upper level [%][                                        ]
        > > >  [0.2.1] Computation :                                             3.0841E+02  |       25.79%   |      100.00%                    [  3.0841E+02   3.0841E+02   3.0841E+02  ]
        > > > >  [0.2.1.1] Convective scheme :                                   7.4225E+01  |        6.21%   |       24.07%   |       24.07%   [  7.4225E+01   7.4225E+01   7.4225E+01  ]
        > > > >  [0.2.1.2] Diffusion operator :                                  8.9274E+01  |        7.47%   |       28.95%   |       28.95%   [  8.9274E+01   8.9274E+01   8.9274E+01  ]
        > > > >  [0.2.1.4] Time-step calculation :                               4.8618E+00  |        0.41%   |        1.58%   |        1.58%   [  4.8618E+00   4.8618E+00   4.8618E+00  ]
        > > > >  [0.2.1.5] Transport calculation :                               8.2592E-01  |        0.07%   |        0.27%   |        0.27%   [  8.2592E-01   8.2592E-01   8.2592E-01  ]
        > > > >  [0.2.1.6] Thermo calculation :                                  1.4672E+00  |        0.12%   |        0.48%   |        0.48%   [  1.4672E+00   1.4672E+00   1.4672E+00  ]
        > > > >  [0.2.1.7] Gradient calculation :                                2.9224E+01  |        2.44%   |        9.48%   |        9.48%   [  2.9224E+01   2.9224E+01   2.9224E+01  ]
        > > > >  [0.2.1.8] Boundary :                                            3.6475E+00  |        0.31%   |        1.18%   |        1.18%   [  3.6475E+00   3.6475E+00   3.6475E+00  ]
        > > > >  [0.2.1.9] Turbulent viscosity model :                           4.7548E+00  |        0.40%   |        1.54%   |        1.54%   [  4.7548E+00   4.7548E+00   4.7548E+00  ]
        > > > >  [0.2.1.10] Combustion (source term + TFLES + efcy + efcy I0) :  1.5014E+01  |        1.26%   |        4.87%   |        4.87%   [  1.5014E+01   1.5014E+01   1.5014E+01  ]
        > > > > >  [0.2.1.10.1] Chemical source terms calculation :              6.2687E+00  |        0.52%   |        2.03%   |       41.75%   [  6.2687E+00   6.2687E+00   6.2687E+00  ]
        > > > > >  [0.2.1.10.2] TFLES model calculation :                        5.2677E+00  |        0.44%   |        1.71%   |       35.08%   [  5.2677E+00   5.2677E+00   5.2677E+00  ]
        > > > > >  [0.2.1.10.3] Efficiency function calculation :                3.4779E+00  |        0.29%   |        1.13%   |       23.16%   [  3.4779E+00   3.4779E+00   3.4779E+00  ]
        > > > > >  [0.2.1.10.4] Efficiency I0 function calculation :             0.0000E+00  |        0.00%   |        0.00%   |        0.00%   [  0.0000E+00   0.0000E+00   0.0000E+00  ]
        > > > >  [0.2.1.11] Artificial viscosity :                               3.2180E+01  |        2.69%   |       10.43%   |       10.43%   [  3.2180E+01   3.2180E+01   3.2180E+01  ]
        > > > >  [0.2.1.17] Source terms :                                       0.0000E+00  |        0.00%   |        0.00%   |        0.00%   [  0.0000E+00   0.0000E+00   0.0000E+00  ]
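
Note: the "relative to computation" and "relative to upper level" columns are plain ratios of a timer to its reference timer. The sketch below recomputes them for the 4th-level timers and also sums the direct children of [0.2.1]. By this arithmetic the listed children cover roughly 83% of the Computation time; the log does not say where the remainder is spent, so no attribution is attempted here.

```python
# Timers copied from this log [s].
computation_s = 3.0841e02  # [0.2.1] Computation

# Direct children of [0.2.1] as printed in the 4th-level table.
children_s = {
    "Convective scheme": 7.4225e01,
    "Diffusion operator": 8.9274e01,
    "Time-step calculation": 4.8618e00,
    "Transport calculation": 8.2592e-01,
    "Thermo calculation": 1.4672e00,
    "Gradient calculation": 2.9224e01,
    "Boundary": 3.6475e00,
    "Turbulent viscosity model": 4.7548e00,
    "Combustion": 1.5014e01,
    "Artificial viscosity": 3.2180e01,
    "Source terms": 0.0,
}

# "relative to computation [%]" for each child.
pct_of_computation = {k: 100.0 * v / computation_s for k, v in children_s.items()}

# Fraction of the Computation timer covered by the listed children.
coverage_pct = 100.0 * sum(children_s.values()) / computation_s

# "relative to upper level [%]" for one 5th-level timer:
# [0.2.1.10.1] Chemical source terms relative to [0.2.1.10] Combustion.
chem_pct_of_combustion = 100.0 * 6.2687e00 / children_s["Combustion"]

for name, pct in sorted(pct_of_computation.items(), key=lambda kv: -kv[1]):
    print(f"{name:<26s}: {pct:6.2f} %")
print(f"Covered by listed timers  : {coverage_pct:6.2f} %")
print(f"Chemistry / Combustion    : {chem_pct_of_combustion:6.2f} %")
```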


 ----> End of AVBP session

 ----> Found 4 warning messages for this computation, check your output file!


 
 ***** Memory usage (system):  Max: 17819.121 MB (rank:0)  Min: 17819.121 MB (rank:0)  Ave: 17819.121 MB  Std: 0.000 MB 
 

 ***** Maximum memory (mod_alloc) : 2129081512 B ( 2.030450E+03 MB)



Your experiment path is /home/exter/neto/LEFEX_20M/RUN/maqao_2025-06-12_15-19-48/tools/lprof_npsu_run_0

To display your profiling results:
###########################################################################################################################################
#    LEVEL    |     REPORT     |                                                 COMMAND                                                  #
###########################################################################################################################################
#  Functions  |  Cluster-wide  |  maqao lprof -df xp=/home/exter/neto/LEFEX_20M/RUN/maqao_2025-06-12_15-19-48/tools/lprof_npsu_run_0      #
#  Functions  |  Per-node      |  maqao lprof -df -dn xp=/home/exter/neto/LEFEX_20M/RUN/maqao_2025-06-12_15-19-48/tools/lprof_npsu_run_0  #
#  Functions  |  Per-process   |  maqao lprof -df -dp xp=/home/exter/neto/LEFEX_20M/RUN/maqao_2025-06-12_15-19-48/tools/lprof_npsu_run_0  #
#  Functions  |  Per-thread    |  maqao lprof -df -dt xp=/home/exter/neto/LEFEX_20M/RUN/maqao_2025-06-12_15-19-48/tools/lprof_npsu_run_0  #
#  Loops      |  Cluster-wide  |  maqao lprof -dl xp=/home/exter/neto/LEFEX_20M/RUN/maqao_2025-06-12_15-19-48/tools/lprof_npsu_run_0      #
#  Loops      |  Per-node      |  maqao lprof -dl -dn xp=/home/exter/neto/LEFEX_20M/RUN/maqao_2025-06-12_15-19-48/tools/lprof_npsu_run_0  #
#  Loops      |  Per-process   |  maqao lprof -dl -dp xp=/home/exter/neto/LEFEX_20M/RUN/maqao_2025-06-12_15-19-48/tools/lprof_npsu_run_0  #
#  Loops      |  Per-thread    |  maqao lprof -dl -dt xp=/home/exter/neto/LEFEX_20M/RUN/maqao_2025-06-12_15-19-48/tools/lprof_npsu_run_0  #
###########################################################################################################################################