* [MAQAO] Info: Detected 1 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1228045 4904179
executing #MPI = 1 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1228045
Average density of rows/columns = 6
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 7.5860
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.2708
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 66350237
-- (3) Real space for factors (estimated) = 76349627
-- (4) Integer space for factors (estimated) = 8727328
-- (5) Maximum frontal size (estimated) = 1386
-- (6) Number of nodes in the tree = 81870
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 0
Number of split nodes = 0
RINFOG(1) Operations during elimination (estim)= 1.488D+10
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Total space in MBytes, IC factorization (INFOG(17)): 988
Total space in MBytes, OOC factorization (INFOG(27)): 322
Elapsed time in analysis driver= 8.3943
Analysis time by clock_gettime(): 8.394 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1228045 4904179
executing #MPI = 1 and #OMP = 2
Elapsed time in save structure driver= 0.0002
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1228045 4904179
executing #MPI = 1 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 1
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 76349627
INFOG(4) Integer space for factors (estim.)= 8727328
Maximum frontal size (estimated) = 1386
Number of nodes in the tree = 81870
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
Statistics on the scaling phase
Elapsed time for scaling = 0.0810
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.30D-01
Effective size of S (based on INFO(39))= 49723357
Redistrib: total data local/sent = 0 0
Elapsed time to reformat/distribute matrix = 0.1195
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 230000
Size of async. emission buffer (bytes).. = 230012
Small emission buffer (bytes) .......... = 20
** Memory allocated, total in Mbytes (INFOG(19)): 988
** Memory effectively used, total in Mbytes (INFOG(22)): 896
Flops under L0 layer = 1.117D+09
Elapsed time under L0 = 0.8466
Elapsed time for factorization = 2.0024
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 7.824D+07
------ (3) Operations in node elimination = 1.488D+10
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 76349627
INFOG (10) Integer space for factors = 8727328
INFOG (11) Maximum front size = 1386
INFOG (29) Number of entries in factors = 66350237
INFOG (12) Number of negative pivots = 0
INFOG (13) Number of delayed pivots = 0
Number of 2x2 pivots in type 1 nodes = 0
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 5.885D-02
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 5.885D-02
RINFOG(21) Largest pivot in absolute value = 1.000D+00
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 0
Elapsed time in factorization driver = 2.2155
Factorization time by clock_gettime(): 2.2156 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 1 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_0
To display your profiling results:
################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_0 #
################################################################################################################################################################
* [MAQAO] Info: Detected 2 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1228045 4904179
executing #MPI = 2 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1228045
Average density of rows/columns = 6
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 7.5940
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.2703
A root of estimated size 1382 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 66350237
-- (3) Real space for factors (estimated) = 77209064
-- (4) Integer space for factors (estimated) = 8728739
-- (5) Maximum frontal size (estimated) = 1386
-- (6) Number of nodes in the tree = 81870
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 1
Number of split nodes = 0
RINFOG(1) Operations during elimination (estim)= 1.575D+10
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 550
Total space in MBytes, IC factorization (INFOG(17)): 1067
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 208
Total space in MBytes, OOC factorization (INFOG(27)): 411
Elapsed time in analysis driver= 8.5295
Analysis time by clock_gettime(): 8.529 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1228045 4904179
executing #MPI = 2 and #OMP = 2
Elapsed time in save structure driver= 0.0003
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1228045 4904179
executing #MPI = 2 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 2
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 77209064
INFOG(4) Integer space for factors (estim.)= 8728739
Maximum frontal size (estimated) = 1386
Number of nodes in the tree = 81870
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
Statistics on the scaling phase
Elapsed time for scaling = 0.0844
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.30D-01
Average Effective size of S (based on INFO(39))= 26869046
Elapsed time to reformat/distribute matrix = 0.1406
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 1147920
Size of async. emission buffer (bytes).. = 4603161
Small emission buffer (bytes) .......... = 228
** Memory allocated, max in Mbytes (INFOG(18)): 550
** Memory allocated, total in Mbytes (INFOG(19)): 1067
** Memory effectively used, max in Mbytes (INFOG(21)): 494
** Memory effectively used, total in Mbytes (INFOG(22)): 961
Flops under L0 layer (avg/max across MPI) = 5.607D+08 5.809D+08
Elapsed time under L0 (avg/max across MPI) = 0.4445 0.4526
Elapsed time to process root node = 0.0389
Elapsed time for factorization = 1.1148
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 7.821D+07
------ (3) Operations in node elimination = 1.575D+10
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 77209064
INFOG (10) Integer space for factors = 8728753
INFOG (11) Maximum front size = 1386
INFOG (29) Number of entries in factors = 66350237
INFOG (12) Number of negative pivots = 0
INFOG (13) Number of delayed pivots = 0
Number of 2x2 pivots in type 1 nodes = 0
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 5.885D-02
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 5.885D-02
RINFOG(21) Largest pivot in absolute value = 1.000D+00
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 0
Elapsed time in factorization driver = 1.3516
Factorization time by clock_gettime(): 1.3537 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 2 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_1
To display your profiling results:
################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_1 #
################################################################################################################################################################
* [MAQAO] Info: Detected 4 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1228045 4904179
executing #MPI = 4 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1228045
Average density of rows/columns = 6
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 7.6108
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.2676
A root of estimated size 1382 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 66350237
-- (3) Real space for factors (estimated) = 77209064
-- (4) Integer space for factors (estimated) = 8736265
-- (5) Maximum frontal size (estimated) = 1386
-- (6) Number of nodes in the tree = 81870
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 3
Number of split nodes = 0
RINFOG(1) Operations during elimination (estim)= 1.575D+10
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 302
Total space in MBytes, IC factorization (INFOG(17)): 1155
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 134
Total space in MBytes, OOC factorization (INFOG(27)): 528
Elapsed time in analysis driver= 8.4980
Analysis time by clock_gettime(): 8.498 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1228045 4904179
executing #MPI = 4 and #OMP = 2
Elapsed time in save structure driver= 0.0003
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1228045 4904179
executing #MPI = 4 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 4
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 77209064
INFOG(4) Integer space for factors (estim.)= 8736265
Maximum frontal size (estimated) = 1386
Number of nodes in the tree = 81870
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
Statistics on the scaling phase
Elapsed time for scaling = 0.0854
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.30D-01
Average Effective size of S (based on INFO(39))= 13838273
Elapsed time to reformat/distribute matrix = 0.1326
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 1147948
Size of async. emission buffer (bytes).. = 4603272
Small emission buffer (bytes) .......... = 700
** Memory allocated, max in Mbytes (INFOG(18)): 302
** Memory allocated, total in Mbytes (INFOG(19)): 1155
** Memory effectively used, max in Mbytes (INFOG(21)): 271
** Memory effectively used, total in Mbytes (INFOG(22)): 1044
Flops under L0 layer (avg/max across MPI) = 2.810D+08 3.041D+08
Elapsed time under L0 (avg/max across MPI) = 0.2129 0.2234
Elapsed time to process root node = 0.0365
Elapsed time for factorization = 0.6289
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 7.865D+07
------ (3) Operations in node elimination = 1.575D+10
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 77209064
INFOG (10) Integer space for factors = 8736016
INFOG (11) Maximum front size = 1386
INFOG (29) Number of entries in factors = 66350237
INFOG (12) Number of negative pivots = 0
INFOG (13) Number of delayed pivots = 0
Number of 2x2 pivots in type 1 nodes = 0
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 5.885D-02
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 5.885D-02
RINFOG(21) Largest pivot in absolute value = 1.000D+00
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 0
Elapsed time in factorization driver = 0.8590
Factorization time by clock_gettime(): 0.8659 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 4 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_2
To display your profiling results:
################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_2 #
################################################################################################################################################################
* [MAQAO] Info: Detected 8 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1228045 4904179
executing #MPI = 8 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1228045
Average density of rows/columns = 6
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 7.5961
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.2695
A root of estimated size 1382 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 66350237
-- (3) Real space for factors (estimated) = 77191406
-- (4) Integer space for factors (estimated) = 8744543
-- (5) Maximum frontal size (estimated) = 1386
-- (6) Number of nodes in the tree = 81871
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 9
Number of split nodes = 1
RINFOG(1) Operations during elimination (estim)= 1.575D+10
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 180
Total space in MBytes, IC factorization (INFOG(17)): 1343
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 97
Total space in MBytes, OOC factorization (INFOG(27)): 736
Elapsed time in analysis driver= 8.4815
Analysis time by clock_gettime(): 8.482 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1228045 4904179
executing #MPI = 8 and #OMP = 2
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
Elapsed time in save structure driver= 0.0005
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1228045 4904179
executing #MPI = 8 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
****** FACTORIZATION STEP ********
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 8
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 77191406
INFOG(4) Integer space for factors (estim.)= 8744543
Maximum frontal size (estimated) = 1386
Number of nodes in the tree = 81871
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
Statistics on the scaling phase
Elapsed time for scaling = 0.0865
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.30D-01
Average Effective size of S (based on INFO(39))= 7542764
Elapsed time to reformat/distribute matrix = 0.1301
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 1139504
Size of async. emission buffer (bytes).. = 4569405
Small emission buffer (bytes) .......... = 2164
** Memory allocated, max in Mbytes (INFOG(18)): 179
** Memory allocated, total in Mbytes (INFOG(19)): 1348
** Memory effectively used, max in Mbytes (INFOG(21)): 162
** Memory effectively used, total in Mbytes (INFOG(22)): 1224
Flops under L0 layer (avg/max across MPI) = 1.413D+08 1.539D+08
Elapsed time under L0 (avg/max across MPI) = 0.1112 0.1186
Elapsed time to process root node = 0.0242
Elapsed time for factorization = 0.3454
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 7.941D+07
------ (3) Operations in node elimination = 1.575D+10
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 77193043
INFOG (10) Integer space for factors = 8743714
INFOG (11) Maximum front size = 1386
INFOG (29) Number of entries in factors = 66350237
INFOG (12) Number of negative pivots = 0
INFOG (13) Number of delayed pivots = 0
Number of 2x2 pivots in type 1 nodes = 0
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 5.885D-02
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 5.885D-02
RINFOG(21) Largest pivot in absolute value = 1.000D+00
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 0
Elapsed time in factorization driver = 0.5828
Factorization time by clock_gettime(): 0.5840 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 8 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_3
To display your profiling results:
################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_3 #
################################################################################################################################################################
* [MAQAO] Info: Detected 16 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1228045 4904179
executing #MPI = 16 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1228045
Average density of rows/columns = 6
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 7.5825
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.2657
A root of estimated size 1382 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 66350237
-- (3) Real space for factors (estimated) = 77184744
-- (4) Integer space for factors (estimated) = 8771998
-- (5) Maximum frontal size (estimated) = 1386
-- (6) Number of nodes in the tree = 81873
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 23
Number of split nodes = 3
RINFOG(1) Operations during elimination (estim)= 1.575D+10
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 136
Total space in MBytes, IC factorization (INFOG(17)): 1760
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 97
Total space in MBytes, OOC factorization (INFOG(27)): 1172
Elapsed time in analysis driver= 8.4470
Analysis time by clock_gettime(): 8.447 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1228045 4904179
executing #MPI = 16 and #OMP = 2
Elapsed time in save structure driver= 0.0006
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1228045 4904179
executing #MPI = 16 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 16
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 77184744
INFOG(4) Integer space for factors (estim.)= 8771998
Maximum frontal size (estimated) = 1386
Number of nodes in the tree = 81873
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
Statistics on the scaling phase
Elapsed time for scaling = 0.0897
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.30D-01
Average Effective size of S (based on INFO(39))= 4340604
Elapsed time to reformat/distribute matrix = 0.1293
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 1139616
Size of async. emission buffer (bytes).. = 4569848
Small emission buffer (bytes) .......... = 6924
** Memory allocated, max in Mbytes (INFOG(18)): 139
** Memory allocated, total in Mbytes (INFOG(19)): 1750
** Memory effectively used, max in Mbytes (INFOG(21)): 131
** Memory effectively used, total in Mbytes (INFOG(22)): 1595
Flops under L0 layer (avg/max across MPI) = 8.342D+07 2.013D+08
Elapsed time under L0 (avg/max across MPI) = 0.0633 0.0712
Elapsed time to process root node = 0.0247
Elapsed time for factorization = 0.2473
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 8.025D+07
------ (3) Operations in node elimination = 1.575D+10
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 77190575
INFOG (10) Integer space for factors = 8767917
INFOG (11) Maximum front size = 1386
INFOG (29) Number of entries in factors = 66350237
INFOG (12) Number of negative pivots = 0
INFOG (13) Number of delayed pivots = 0
Number of 2x2 pivots in type 1 nodes = 0
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 5.885D-02
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 5.885D-02
RINFOG(21) Largest pivot in absolute value = 1.000D+00
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 1
Elapsed time in factorization driver = 0.4850
Factorization time by clock_gettime(): 0.4875 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 16 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_4
To display your profiling results:
################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_4 #
################################################################################################################################################################
* [MAQAO] Info: Detected 32 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1228045 4904179
executing #MPI = 32 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1228045
Average density of rows/columns = 6
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 7.8327
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.2977
A root of estimated size 1382 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 66350237
-- (3) Real space for factors (estimated) = 77165049
-- (4) Integer space for factors (estimated) = 8852226
-- (5) Maximum frontal size (estimated) = 1386
-- (6) Number of nodes in the tree = 81874
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 48
Number of split nodes = 4
RINFOG(1) Operations during elimination (estim)= 1.575D+10
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 110
Total space in MBytes, IC factorization (INFOG(17)): 2344
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 89
Total space in MBytes, OOC factorization (INFOG(27)): 1825
Elapsed time in analysis driver= 8.8465
Analysis time by clock_gettime(): 8.847 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1228045 4904179
executing #MPI = 32 and #OMP = 2
Elapsed time in save structure driver= 0.0009
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1228045 4904179
executing #MPI = 32 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 32
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 77165049
INFOG(4) Integer space for factors (estim.)= 8852226
Maximum frontal size (estimated) = 1386
Number of nodes in the tree = 81874
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 0.0986
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.30D-01
Average Effective size of S (based on INFO(39))= 2171262
Elapsed time to reformat/distribute matrix = 0.1557
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 971676
Size of async. emission buffer (bytes).. = 3896421
Small emission buffer (bytes) .......... = 24064
** Memory allocated, max in Mbytes (INFOG(18)): 112
** Memory allocated, total in Mbytes (INFOG(19)): 2313
** Memory effectively used, max in Mbytes (INFOG(21)): 105
** Memory effectively used, total in Mbytes (INFOG(22)): 2151
Flops under L0 layer (avg/max across MPI) = 3.991D+07 1.013D+08
Elapsed time under L0 (avg/max across MPI) = 0.0402 0.0484
Elapsed time to process root node = 0.0261
Elapsed time for factorization = 0.2088
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 8.021D+07
------ (3) Operations in node elimination = 1.575D+10
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 77188793
INFOG (10) Integer space for factors = 8829008
INFOG (11) Maximum front size = 1386
INFOG (29) Number of entries in factors = 66350237
INFOG (12) Number of negative pivots = 0
INFOG (13) Number of delayed pivots = 0
Number of 2x2 pivots in type 1 nodes = 0
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 5.885D-02
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 5.885D-02
RINFOG(21) Largest pivot in absolute value = 1.000D+00
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 9
Elapsed time in factorization driver = 0.4803
Factorization time by clock_gettime(): 0.4943 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 32 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_5
To display your profiling results:
################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_5 #
################################################################################################################################################################
* [MAQAO] Info: Detected 64 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1228045 4904179
executing #MPI = 64 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1228045
Average density of rows/columns = 6
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 7.8061
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.2948
A root of estimated size 1382 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 66350237
-- (3) Real space for factors (estimated) = 77143666
-- (4) Integer space for factors (estimated) = 8964997
-- (5) Maximum frontal size (estimated) = 1386
-- (6) Number of nodes in the tree = 81877
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 88
Number of split nodes = 7
RINFOG(1) Operations during elimination (estim)= 1.575D+10
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 136
Total space in MBytes, IC factorization (INFOG(17)): 3678
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 126
Total space in MBytes, OOC factorization (INFOG(27)): 3295
Elapsed time in analysis driver= 8.8300
Analysis time by clock_gettime(): 8.831 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1228045 4904179
executing #MPI = 64 and #OMP = 2
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
PRE FACTO START LPROF----------------------
Elapsed time in save structure driver= 0.0014
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1228045 4904179
executing #MPI = 64 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 64
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 77143666
INFOG(4) Integer space for factors (estim.)= 8964997
Maximum frontal size (estimated) = 1386
Number of nodes in the tree = 81877
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 0.0891
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.30D-01
Average Effective size of S (based on INFO(39))= 1124555
Elapsed time to reformat/distribute matrix = 0.1706
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 972120
Size of async. emission buffer (bytes).. = 3898192
Small emission buffer (bytes) .......... = 88864
** Memory allocated, max in Mbytes (INFOG(18)): 137
** Memory allocated, total in Mbytes (INFOG(19)): 3552
** Memory effectively used, max in Mbytes (INFOG(21)): 132
** Memory effectively used, total in Mbytes (INFOG(22)): 3376
Flops under L0 layer (avg/max across MPI) = 1.964D+07 4.092D+07
Elapsed time under L0 (avg/max across MPI) = 0.0305 0.0357
Elapsed time to process root node = 0.0369
Elapsed time for factorization = 0.1565
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 8.113D+07
------ (3) Operations in node elimination = 1.575D+10
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 77184486
INFOG (10) Integer space for factors = 8915946
INFOG (11) Maximum front size = 1386
INFOG (29) Number of entries in factors = 66350237
INFOG (12) Number of negative pivots = 0
INFOG (13) Number of delayed pivots = 0
Number of 2x2 pivots in type 1 nodes = 0
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 5.885D-02
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 5.885D-02
RINFOG(21) Largest pivot in absolute value = 1.000D+00
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 33
Elapsed time in factorization driver = 0.4420
Factorization time by clock_gettime(): 0.4569 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 64 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_6
To display your profiling results:
################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_6 #
################################################################################################################################################################
* [MAQAO] Info: Detected 86 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1228045 4904179
executing #MPI = 86 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1228045
Average density of rows/columns = 6
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 7.8229
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.3023
A root of estimated size 1382 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 66350237
-- (3) Real space for factors (estimated) = 77121001
-- (4) Integer space for factors (estimated) = 9012472
-- (5) Maximum frontal size (estimated) = 1386
-- (6) Number of nodes in the tree = 81877
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 95
Number of split nodes = 7
RINFOG(1) Operations during elimination (estim)= 1.575D+10
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 102
Total space in MBytes, IC factorization (INFOG(17)): 4574
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 96
Total space in MBytes, OOC factorization (INFOG(27)): 4292
Elapsed time in analysis driver= 8.8805
Analysis time by clock_gettime(): 8.882 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1228045 4904179
executing #MPI = 86 and #OMP = 2
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
PRE FACTO START LPROF----------------------
Elapsed time in save structure driver= 0.0011
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1228045 4904179
executing #MPI = 86 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 86
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 77121001
INFOG(4) Integer space for factors (estim.)= 9012472
Maximum frontal size (estimated) = 1386
Number of nodes in the tree = 81877
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 0.0923
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.30D-01
Average Effective size of S (based on INFO(39))= 865073
Elapsed time to reformat/distribute matrix = 0.1887
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 972424
Size of async. emission buffer (bytes).. = 3899409
Small emission buffer (bytes) .......... = 156764
** Memory allocated, max in Mbytes (INFOG(18)): 105
** Memory allocated, total in Mbytes (INFOG(19)): 4392
** Memory effectively used, max in Mbytes (INFOG(21)): 104
** Memory effectively used, total in Mbytes (INFOG(22)): 4250
Flops under L0 layer (avg/max across MPI) = 1.519D+07 2.660D+07
Elapsed time under L0 (avg/max across MPI) = 0.0338 0.0388
Elapsed time to process root node = 0.0323
Elapsed time for factorization = 0.1793
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 8.107D+07
------ (3) Operations in node elimination = 1.575D+10
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 77184486
INFOG (10) Integer space for factors = 8954272
INFOG (11) Maximum front size = 1386
INFOG (29) Number of entries in factors = 66350237
INFOG (12) Number of negative pivots = 0
INFOG (13) Number of delayed pivots = 0
Number of 2x2 pivots in type 1 nodes = 0
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 5.885D-02
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 5.885D-02
RINFOG(21) Largest pivot in absolute value = 1.000D+00
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 38
Elapsed time in factorization driver = 0.4907
Factorization time by clock_gettime(): 0.5136 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 86 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_7
To display your profiling results:
################################################################################################################################################################
# LEVEL | REPORT | COMMAND #
################################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_thermal2_allowextra_scala_kptr_probe/tools/lprof_run_7 #
################################################################################################################################################################