* [MAQAO] Info: Detected 1 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1000000 2998000
executing #MPI = 1 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1000000
Average density of rows/columns = 4
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 4.5701
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.1190
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 48473287
-- (3) Real space for factors (estimated) = 56478575
-- (4) Integer space for factors (estimated) = 6459886
-- (5) Maximum frontal size (estimated) = 1666
-- (6) Number of nodes in the tree = 66031
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 0
Number of split nodes = 0
RINFOG(1) Operations during elimination (estim)= 1.410D+10
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Total space in MBytes, IC factorization (INFOG(17)): 728
Total space in MBytes, OOC factorization (INFOG(27)): 251
Elapsed time in analysis driver= 4.9641
Analysis time by clock_gettime(): 4.964 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1000000 2998000
executing #MPI = 1 and #OMP = 2
Elapsed time in save structure driver= 0.0002
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1000000 2998000
executing #MPI = 1 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 1
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 56478575
INFOG(4) Integer space for factors (estim.)= 6459886
Maximum frontal size (estimated) = 1666
Number of nodes in the tree = 66031
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
Statistics on the scaling phase
Elapsed time for scaling = 0.0452
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.81D-02
Effective size of S (based on INFO(39))= 37512721
Redistrib: total data local/sent = 0 0
Elapsed time to reformat/distribute matrix = 0.0496
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 230000
Size of async. emission buffer (bytes).. = 230012
Small emission buffer (bytes) .......... = 20
** Memory allocated, total in Mbytes (INFOG(19)): 728
** Memory effectively used, total in Mbytes (INFOG(22)): 658
Flops under L0 layer = 7.011D+08
Elapsed time under L0 = 0.7065
Elapsed time for factorization = 1.5812
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 5.050D+07
------ (3) Operations in node elimination = 1.410D+10
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 56478575
INFOG (10) Integer space for factors = 6459886
INFOG (11) Maximum front size = 1666
INFOG (29) Number of entries in factors = 48473287
INFOG (12) Number of negative pivots = 0
INFOG (13) Number of delayed pivots = 0
Number of 2x2 pivots in type 1 nodes = 0
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 4.173D-12
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 4.173D-12
RINFOG(21) Largest pivot in absolute value = 6.207D-01
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 0
Elapsed time in factorization driver = 1.6867
Factorization time by clock_gettime(): 1.6868 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 1 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_0
To display your profiling results:
###############################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###############################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_0 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_0 #
###############################################################################################################################################################
* [MAQAO] Info: Detected 2 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1000000 2998000
executing #MPI = 2 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1000000
Average density of rows/columns = 4
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 4.5527
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.1171
A root of estimated size 1486 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 48473287
-- (3) Real space for factors (estimated) = 57472256
-- (4) Integer space for factors (estimated) = 6461347
-- (5) Maximum frontal size (estimated) = 1666
-- (6) Number of nodes in the tree = 66031
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 1
Number of split nodes = 0
RINFOG(1) Operations during elimination (estim)= 1.519D+10
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 410
Total space in MBytes, IC factorization (INFOG(17)): 820
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 183
Total space in MBytes, OOC factorization (INFOG(27)): 360
Elapsed time in analysis driver= 5.0428
Analysis time by clock_gettime(): 5.043 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1000000 2998000
executing #MPI = 2 and #OMP = 2
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
Elapsed time in save structure driver= 0.0003
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1000000 2998000
executing #MPI = 2 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 2
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 57472256
INFOG(4) Integer space for factors (estim.)= 6461347
Maximum frontal size (estimated) = 1666
Number of nodes in the tree = 66031
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
Statistics on the scaling phase
Elapsed time for scaling = 0.0437
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.81D-02
Average Effective size of S (based on INFO(39))= 21914424
Elapsed time to reformat/distribute matrix = 0.0502
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 1237604
Size of async. emission buffer (bytes).. = 4962784
Small emission buffer (bytes) .......... = 212
** Memory allocated, max in Mbytes (INFOG(18)): 410
** Memory allocated, total in Mbytes (INFOG(19)): 820
** Memory effectively used, max in Mbytes (INFOG(21)): 363
** Memory effectively used, total in Mbytes (INFOG(22)): 726
Flops under L0 layer (avg/max across MPI) = 3.506D+08 3.513D+08
Elapsed time under L0 (avg/max across MPI) = 0.3218 0.3289
Elapsed time to process root node = 0.0451
Elapsed time for factorization = 0.8443
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 5.053D+07
------ (3) Operations in node elimination = 1.519D+10
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 57472256
INFOG (10) Integer space for factors = 6461361
INFOG (11) Maximum front size = 1666
INFOG (29) Number of entries in factors = 48473287
INFOG (12) Number of negative pivots = 0
INFOG (13) Number of delayed pivots = 0
Number of 2x2 pivots in type 1 nodes = 0
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 4.181D-12
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 4.181D-12
RINFOG(21) Largest pivot in absolute value = 6.207D-01
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 0
Elapsed time in factorization driver = 0.9482
Factorization time by clock_gettime(): 0.9484 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 2 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_1
To display your profiling results:
###############################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###############################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_1 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_1 #
###############################################################################################################################################################
* [MAQAO] Info: Detected 4 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1000000 2998000
executing #MPI = 4 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1000000
Average density of rows/columns = 4
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 4.5600
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.1204
A root of estimated size 1486 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 48473287
-- (3) Real space for factors (estimated) = 57469869
-- (4) Integer space for factors (estimated) = 6476434
-- (5) Maximum frontal size (estimated) = 1666
-- (6) Number of nodes in the tree = 66031
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 5
Number of split nodes = 0
RINFOG(1) Operations during elimination (estim)= 1.519D+10
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 229
Total space in MBytes, IC factorization (INFOG(17)): 866
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 111
Total space in MBytes, OOC factorization (INFOG(27)): 422
Elapsed time in analysis driver= 5.0502
Analysis time by clock_gettime(): 5.050 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1000000 2998000
executing #MPI = 4 and #OMP = 2
Elapsed time in save structure driver= 0.0003
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1000000 2998000
executing #MPI = 4 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 4
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 57469869
INFOG(4) Integer space for factors (estim.)= 6476434
Maximum frontal size (estimated) = 1666
Number of nodes in the tree = 66031
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
Statistics on the scaling phase
Elapsed time for scaling = 0.0449
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.81D-02
Average Effective size of S (based on INFO(39))= 10565630
Elapsed time to reformat/distribute matrix = 0.0634
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 979976
Size of async. emission buffer (bytes).. = 3929698
Small emission buffer (bytes) .......... = 724
** Memory allocated, max in Mbytes (INFOG(18)): 229
** Memory allocated, total in Mbytes (INFOG(19)): 866
** Memory effectively used, max in Mbytes (INFOG(21)): 205
** Memory effectively used, total in Mbytes (INFOG(22)): 778
Flops under L0 layer (avg/max across MPI) = 1.759D+08 1.847D+08
Elapsed time under L0 (avg/max across MPI) = 0.1653 0.1682
Elapsed time to process root node = 0.0426
Elapsed time for factorization = 0.5127
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 5.051D+07
------ (3) Operations in node elimination = 1.519D+10
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 57472256
INFOG (10) Integer space for factors = 6474425
INFOG (11) Maximum front size = 1666
INFOG (29) Number of entries in factors = 48473287
INFOG (12) Number of negative pivots = 0
INFOG (13) Number of delayed pivots = 0
Number of 2x2 pivots in type 1 nodes = 0
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 4.183D-12
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 4.183D-12
RINFOG(21) Largest pivot in absolute value = 6.207D-01
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 0
Elapsed time in factorization driver = 0.6309
Factorization time by clock_gettime(): 0.6371 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 4 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_2
To display your profiling results:
###############################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###############################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_2 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_2 #
###############################################################################################################################################################
* [MAQAO] Info: Detected 8 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1000000 2998000
executing #MPI = 8 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1000000
Average density of rows/columns = 4
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 4.5514
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.1184
A root of estimated size 1486 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 48473287
-- (3) Real space for factors (estimated) = 57470631
-- (4) Integer space for factors (estimated) = 6477611
-- (5) Maximum frontal size (estimated) = 1666
-- (6) Number of nodes in the tree = 66031
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 8
Number of split nodes = 0
RINFOG(1) Operations during elimination (estim)= 1.519D+10
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 143
Total space in MBytes, IC factorization (INFOG(17)): 1061
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 89
Total space in MBytes, OOC factorization (INFOG(27)): 646
Elapsed time in analysis driver= 5.0356
Analysis time by clock_gettime(): 5.036 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1000000 2998000
executing #MPI = 8 and #OMP = 2
Elapsed time in save structure driver= 0.0003
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1000000 2998000
executing #MPI = 8 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
****** FACTORIZATION STEP ********
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 8
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 57470631
INFOG(4) Integer space for factors (estim.)= 6477611
Maximum frontal size (estimated) = 1666
Number of nodes in the tree = 66031
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 0.0466
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.81D-02
Average Effective size of S (based on INFO(39))= 6202692
Elapsed time to reformat/distribute matrix = 0.0580
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 1265948
Size of async. emission buffer (bytes).. = 5076449
Small emission buffer (bytes) .......... = 2000
** Memory allocated, max in Mbytes (INFOG(18)): 148
** Memory allocated, total in Mbytes (INFOG(19)): 1064
** Memory effectively used, max in Mbytes (INFOG(21)): 132
** Memory effectively used, total in Mbytes (INFOG(22)): 964
Flops under L0 layer (avg/max across MPI) = 8.763D+07 8.924D+07
Elapsed time under L0 (avg/max across MPI) = 0.0843 0.0900
Elapsed time to process root node = 0.0285
Elapsed time for factorization = 0.2865
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 5.129D+07
------ (3) Operations in node elimination = 1.519D+10
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 57472256
INFOG (10) Integer space for factors = 6475741
INFOG (11) Maximum front size = 1666
INFOG (29) Number of entries in factors = 48473287
INFOG (12) Number of negative pivots = 0
INFOG (13) Number of delayed pivots = 0
Number of 2x2 pivots in type 1 nodes = 0
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 4.182D-12
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 4.182D-12
RINFOG(21) Largest pivot in absolute value = 6.207D-01
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 0
Elapsed time in factorization driver = 0.4090
Factorization time by clock_gettime(): 0.4094 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 8 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_3
To display your profiling results:
###############################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###############################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_3 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_3 #
###############################################################################################################################################################
* [MAQAO] Info: Detected 16 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1000000 2998000
executing #MPI = 16 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1000000
Average density of rows/columns = 4
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 4.5618
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.1157
A root of estimated size 1486 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 48473287
-- (3) Real space for factors (estimated) = 57463229
-- (4) Integer space for factors (estimated) = 6504980
-- (5) Maximum frontal size (estimated) = 1666
-- (6) Number of nodes in the tree = 66031
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 19
Number of split nodes = 0
RINFOG(1) Operations during elimination (estim)= 1.519D+10
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 114
Total space in MBytes, IC factorization (INFOG(17)): 1349
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 88
Total space in MBytes, OOC factorization (INFOG(27)): 961
Elapsed time in analysis driver= 5.0273
Analysis time by clock_gettime(): 5.028 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1000000 2998000
executing #MPI = 16 and #OMP = 2
Elapsed time in save structure driver= 0.0005
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1000000 2998000
executing #MPI = 16 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 16
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 57463229
INFOG(4) Integer space for factors (estim.)= 6504980
Maximum frontal size (estimated) = 1666
Number of nodes in the tree = 66031
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 0.0465
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.81D-02
Average Effective size of S (based on INFO(39))= 3318203
Elapsed time to reformat/distribute matrix = 0.0790
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 980140
Size of async. emission buffer (bytes).. = 3930361
Small emission buffer (bytes) .......... = 6572
** Memory allocated, max in Mbytes (INFOG(18)): 116
** Memory allocated, total in Mbytes (INFOG(19)): 1349
** Memory effectively used, max in Mbytes (INFOG(21)): 110
** Memory effectively used, total in Mbytes (INFOG(22)): 1221
Flops under L0 layer (avg/max across MPI) = 4.421D+07 5.373D+07
Elapsed time under L0 (avg/max across MPI) = 0.0518 0.0592
Elapsed time to process root node = 0.0284
Elapsed time for factorization = 0.2104
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 5.080D+07
------ (3) Operations in node elimination = 1.519D+10
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 57472256
INFOG (10) Integer space for factors = 6496943
INFOG (11) Maximum front size = 1666
INFOG (29) Number of entries in factors = 48473287
INFOG (12) Number of negative pivots = 0
INFOG (13) Number of delayed pivots = 0
Number of 2x2 pivots in type 1 nodes = 0
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 4.182D-12
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 4.182D-12
RINFOG(21) Largest pivot in absolute value = 6.207D-01
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 4
Elapsed time in factorization driver = 0.3531
Factorization time by clock_gettime(): 0.3584 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 16 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_4
To display your profiling results:
###############################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###############################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_4 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_4 #
###############################################################################################################################################################
* [MAQAO] Info: Detected 32 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1000000 2998000
executing #MPI = 32 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1000000
Average density of rows/columns = 4
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 4.6837
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.1416
A root of estimated size 1486 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 48473287
-- (3) Real space for factors (estimated) = 57447357
-- (4) Integer space for factors (estimated) = 6575840
-- (5) Maximum frontal size (estimated) = 1666
-- (6) Number of nodes in the tree = 66031
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 41
Number of split nodes = 0
RINFOG(1) Operations during elimination (estim)= 1.519D+10
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 89
Total space in MBytes, IC factorization (INFOG(17)): 1890
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 78
Total space in MBytes, OOC factorization (INFOG(27)): 1562
Elapsed time in analysis driver= 5.2515
Analysis time by clock_gettime(): 5.252 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1000000 2998000
executing #MPI = 32 and #OMP = 2
Elapsed time in save structure driver= 0.0011
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1000000 2998000
executing #MPI = 32 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 32
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 57447357
INFOG(4) Integer space for factors (estim.)= 6575840
Maximum frontal size (estimated) = 1666
Number of nodes in the tree = 66031
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 0.0537
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.81D-02
Average Effective size of S (based on INFO(39))= 1713006
Elapsed time to reformat/distribute matrix = 0.0734
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 1218000
Size of async. emission buffer (bytes).. = 4884168
Small emission buffer (bytes) .......... = 23396
** Memory allocated, max in Mbytes (INFOG(18)): 90
** Memory allocated, total in Mbytes (INFOG(19)): 1882
** Memory effectively used, max in Mbytes (INFOG(21)): 87
** Memory effectively used, total in Mbytes (INFOG(22)): 1752
Flops under L0 layer (avg/max across MPI) = 2.360D+07 5.715D+07
Elapsed time under L0 (avg/max across MPI) = 0.0310 0.0359
Elapsed time to process root node = 0.0287
Elapsed time for factorization = 0.1611
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 5.064D+07
------ (3) Operations in node elimination = 1.519D+10
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 57472256
INFOG (10) Integer space for factors = 6552339
INFOG (11) Maximum front size = 1666
INFOG (29) Number of entries in factors = 48473287
INFOG (12) Number of negative pivots = 0
INFOG (13) Number of delayed pivots = 0
Number of 2x2 pivots in type 1 nodes = 0
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 4.180D-12
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 4.180D-12
RINFOG(21) Largest pivot in absolute value = 6.207D-01
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 16
Elapsed time in factorization driver = 0.3057
Factorization time by clock_gettime(): 0.3181 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 32 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_5
To display your profiling results:
###############################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###############################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_5 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_5 #
###############################################################################################################################################################
* [MAQAO] Info: Detected 64 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1000000 2998000
executing #MPI = 64 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1000000
Average density of rows/columns = 4
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 4.6832
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.1418
A root of estimated size 1486 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 48473287
-- (3) Real space for factors (estimated) = 57422709
-- (4) Integer space for factors (estimated) = 6663283
-- (5) Maximum frontal size (estimated) = 1666
-- (6) Number of nodes in the tree = 66035
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 72
Number of split nodes = 4
RINFOG(1) Operations during elimination (estim)= 1.519D+10
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 119
Total space in MBytes, IC factorization (INFOG(17)): 2980
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 116
Total space in MBytes, OOC factorization (INFOG(27)): 2748
Elapsed time in analysis driver= 5.2672
Analysis time by clock_gettime(): 5.268 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1000000 2998000
executing #MPI = 64 and #OMP = 2
Elapsed time in save structure driver= 0.0012
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1000000 2998000
executing #MPI = 64 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 64
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 57422709
INFOG(4) Integer space for factors (estim.)= 6663283
Maximum frontal size (estimated) = 1666
Number of nodes in the tree = 66035
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 0.0498
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.81D-02
Average Effective size of S (based on INFO(39))= 991454
Elapsed time to reformat/distribute matrix = 0.0917
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 980804
Size of async. emission buffer (bytes).. = 3933018
Small emission buffer (bytes) .......... = 87504
** Memory allocated, max in Mbytes (INFOG(18)): 122
** Memory allocated, total in Mbytes (INFOG(19)): 2935
** Memory effectively used, max in Mbytes (INFOG(21)): 121
** Memory effectively used, total in Mbytes (INFOG(22)): 2751
Flops under L0 layer (avg/max across MPI) = 1.196D+07 2.376D+07
Elapsed time under L0 (avg/max across MPI) = 0.0254 0.0304
Elapsed time to process root node = 0.0346
Elapsed time for factorization = 0.1352
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 5.381D+07
------ (3) Operations in node elimination = 1.519D+10
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 57458376
INFOG (10) Integer space for factors = 6618910
INFOG (11) Maximum front size = 1666
INFOG (29) Number of entries in factors = 48473287
INFOG (12) Number of negative pivots = 0
INFOG (13) Number of delayed pivots = 0
Number of 2x2 pivots in type 1 nodes = 0
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 4.183D-12
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 4.183D-12
RINFOG(21) Largest pivot in absolute value = 6.207D-01
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 20
Elapsed time in factorization driver = 0.2938
Factorization time by clock_gettime(): 0.3068 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 64 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_6
To display your profiling results:
###############################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###############################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_6 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_6 #
###############################################################################################################################################################
* [MAQAO] Info: Detected 86 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1000000 2998000
executing #MPI = 86 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L D L^T Solver for general symmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1000000
Average density of rows/columns = 4
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 4.6741
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.1349
A root of estimated size 1486 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 48473287
-- (3) Real space for factors (estimated) = 57408457
-- (4) Integer space for factors (estimated) = 6710218
-- (5) Maximum frontal size (estimated) = 1666
-- (6) Number of nodes in the tree = 66036
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 75
Number of split nodes = 5
RINFOG(1) Operations during elimination (estim)= 1.519D+10
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 92
Total space in MBytes, IC factorization (INFOG(17)): 3706
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 89
Total space in MBytes, OOC factorization (INFOG(27)): 3537
Elapsed time in analysis driver= 5.2518
Analysis time by clock_gettime(): 5.253 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1000000 2998000
executing #MPI = 86 and #OMP = 2
Elapsed time in save structure driver= 0.0017
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1000000 2998000
executing #MPI = 86 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 86
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 57408457
INFOG(4) Integer space for factors (estim.)= 6710218
Maximum frontal size (estimated) = 1666
Number of nodes in the tree = 66036
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 0.0492
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.81D-02
Average Effective size of S (based on INFO(39))= 754328
Elapsed time to reformat/distribute matrix = 0.1041
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 981108
Size of async. emission buffer (bytes).. = 3934236
Small emission buffer (bytes) .......... = 154972
** Memory allocated, max in Mbytes (INFOG(18)): 95
** Memory allocated, total in Mbytes (INFOG(19)): 3609
** Memory effectively used, max in Mbytes (INFOG(21)): 94
** Memory effectively used, total in Mbytes (INFOG(22)): 3460
Flops under L0 layer (avg/max across MPI) = 9.667D+06 1.790D+07
Elapsed time under L0 (avg/max across MPI) = 0.0278 0.0318
Elapsed time to process root node = 0.0362
Elapsed time for factorization = 0.1435
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 5.466D+07
------ (3) Operations in node elimination = 1.519D+10
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 57458346
INFOG (10) Integer space for factors = 6654079
INFOG (11) Maximum front size = 1666
INFOG (29) Number of entries in factors = 48473287
INFOG (12) Number of negative pivots = 0
INFOG (13) Number of delayed pivots = 0
Number of 2x2 pivots in type 1 nodes = 0
Number of 2x2 pivots in type 2 nodes = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 4.179D-12
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 4.179D-12
RINFOG(21) Largest pivot in absolute value = 6.207D-01
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 26
Elapsed time in factorization driver = 0.3258
Factorization time by clock_gettime(): 0.3445 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 86 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_7
To display your profiling results:
###############################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###############################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_7 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_ecology_allowextra_scala_kptr_probe/tools/lprof_run_7 #
###############################################################################################################################################################