* [MAQAO] Info: Detected 1 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1489752 10319760
executing #MPI = 1 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L U Solver for unsymmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1489752
... Structural symmetry (in percent)= 100
Average density of rows/columns = 6
... No column permutation
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 12.8636
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.3336
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 2059669522
-- (3) Real space for factors (estimated) = 2059669522
-- (4) Integer space for factors (estimated) = 18147756
-- (5) Maximum frontal size (estimated) = 11342
-- (6) Number of nodes in the tree = 136820
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 0
Number of split nodes = 0
RINFOG(1) Operations during elimination (estim)= 1.088D+13
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Total space in MBytes, IC factorization (INFOG(17)): 26775
Total space in MBytes, OOC factorization (INFOG(27)): 10984
Elapsed time in analysis driver= 13.9191
Analysis time by clock_gettime(): 13.919 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1489752 10319760
executing #MPI = 1 and #OMP = 2
Elapsed time in save structure driver= 0.0003
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1489752 10319760
executing #MPI = 1 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 1
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 2059669522
INFOG(4) Integer space for factors (estim.)= 18147756
Maximum frontal size (estimated) = 11342
Number of nodes in the tree = 136820
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
[0m
Statistics on the scaling phase
Elapsed time for scaling = 0.1545
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.59D-02
Effective size of S (based on INFO(39))= 1487536125
Redistrib: total data local/sent = 0 0
Elapsed time to reformat/distribute matrix = 0.3041
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 230000
Size of async. emission buffer (bytes).. = 314275
Small emission buffer (bytes) .......... = 20
** Memory allocated, total in Mbytes (INFOG(19)): 25801
** Memory effectively used, total in Mbytes (INFOG(22)): 23055
Flops under L0 layer = 2.521D+12
Elapsed time under L0 = 30.3575
Elapsed time for factorization = 140.5157
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 4.948D+09
------ (3) Operations in node elimination = 1.088D+13
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 2059669522
INFOG (10) Integer space for factors = 18147756
INFOG (11) Maximum front size = 11342
INFOG (29) Number of entries in factors = 2059669522
INFOG (12) Number of off diagonal pivots = 0
INFOG (13) Number of delayed pivots = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 3.245D-01
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 3.245D-01
RINFOG(21) Largest pivot in absolute value = 6.699D-01
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 0
Elapsed time in factorization driver = 141.0058
Factorization time by clock_gettime(): 141.0011 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 1 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_0
To display your profiling results:
###########################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_0 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_0 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_0 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_0 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_0 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_0 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_0 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_0 #
###########################################################################################################################################################
* [MAQAO] Info: Detected 2 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1489752 10319760
executing #MPI = 2 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L U Solver for unsymmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1489752
... Structural symmetry (in percent)= 100
Average density of rows/columns = 6
... No column permutation
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 12.7645
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.3275
A root of estimated size 7524 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 2059669522
-- (3) Real space for factors (estimated) = 2059669522
-- (4) Integer space for factors (estimated) = 18175742
-- (5) Maximum frontal size (estimated) = 11342
-- (6) Number of nodes in the tree = 136820
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 3
Number of split nodes = 0
RINFOG(1) Operations during elimination (estim)= 1.088D+13
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 14107
Total space in MBytes, IC factorization (INFOG(17)): 28178
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 5879
Total space in MBytes, OOC factorization (INFOG(27)): 11653
Elapsed time in analysis driver= 14.0621
Analysis time by clock_gettime(): 14.062 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1489752 10319760
executing #MPI = 2 and #OMP = 2
Elapsed time in save structure driver= 0.0004
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1489752 10319760
executing #MPI = 2 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 2
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 2059669522
INFOG(4) Integer space for factors (estim.)= 18175742
Maximum frontal size (estimated) = 11342
Number of nodes in the tree = 136820
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
[0m
Statistics on the scaling phase
Elapsed time for scaling = 0.1501
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.59D-02
Average Effective size of S (based on INFO(39))= 890518119
Elapsed time to reformat/distribute matrix = 0.2818
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 10780468
Size of async. emission buffer (bytes).. = 32449208
Small emission buffer (bytes) .......... = 268
** Memory allocated, max in Mbytes (INFOG(18)): 14106
** Memory allocated, total in Mbytes (INFOG(19)): 28176
** Memory effectively used, max in Mbytes (INFOG(21)): 12780
** Memory effectively used, total in Mbytes (INFOG(22)): 24771
Flops under L0 layer (avg/max across MPI) = 1.224D+12 1.226D+12
Elapsed time under L0 (avg/max across MPI) = 15.1082 15.1361
Elapsed time to process root node = 2.1808
Elapsed time for factorization = 82.2513
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 4.891D+09
------ (3) Operations in node elimination = 1.088D+13
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 2059669522
INFOG (10) Integer space for factors = 18175784
INFOG (11) Maximum front size = 11342
INFOG (29) Number of entries in factors = 2059669522
INFOG (12) Number of off diagonal pivots = 0
INFOG (13) Number of delayed pivots = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 3.245D-01
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 3.245D-01
RINFOG(21) Largest pivot in absolute value = 6.699D-01
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 0
Elapsed time in factorization driver = 82.7155
Factorization time by clock_gettime(): 82.7133 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 2 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_1
To display your profiling results:
###########################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_1 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_1 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_1 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_1 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_1 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_1 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_1 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_1 #
###########################################################################################################################################################
* [MAQAO] Info: Detected 4 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1489752 10319760
executing #MPI = 4 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L U Solver for unsymmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1489752
... Structural symmetry (in percent)= 100
Average density of rows/columns = 6
... No column permutation
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 12.7845
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.3241
A root of estimated size 7524 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 2059669522
-- (3) Real space for factors (estimated) = 2059698074
-- (4) Integer space for factors (estimated) = 18228294
-- (5) Maximum frontal size (estimated) = 11342
-- (6) Number of nodes in the tree = 136820
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 6
Number of split nodes = 0
RINFOG(1) Operations during elimination (estim)= 1.088D+13
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 7715
Total space in MBytes, IC factorization (INFOG(17)): 29092
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 3878
Total space in MBytes, OOC factorization (INFOG(27)): 14905
Elapsed time in analysis driver= 14.0833
Analysis time by clock_gettime(): 14.083 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1489752 10319760
executing #MPI = 4 and #OMP = 2
Elapsed time in save structure driver= 0.0004
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1489752 10319760
executing #MPI = 4 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 4
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 2059698074
INFOG(4) Integer space for factors (estim.)= 18228294
Maximum frontal size (estimated) = 11342
Number of nodes in the tree = 136820
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
[0m ** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
Statistics on the scaling phase
Elapsed time for scaling = 0.1505
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.59D-02
Average Effective size of S (based on INFO(39))= 471463630
Elapsed time to reformat/distribute matrix = 0.2453
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 10957604
Size of async. emission buffer (bytes).. = 32982389
Small emission buffer (bytes) .......... = 664
** Memory allocated, max in Mbytes (INFOG(18)): 7715
** Memory allocated, total in Mbytes (INFOG(19)): 29091
** Memory effectively used, max in Mbytes (INFOG(21)): 6433
** Memory effectively used, total in Mbytes (INFOG(22)): 25358
Flops under L0 layer (avg/max across MPI) = 5.958D+11 6.358D+11
Elapsed time under L0 (avg/max across MPI) = 7.3723 7.7595
Elapsed time to process root node = 1.2829
Elapsed time for factorization = 46.9645
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 4.920D+09
------ (3) Operations in node elimination = 1.088D+13
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 2059669522
INFOG (10) Integer space for factors = 18228536
INFOG (11) Maximum front size = 11342
INFOG (29) Number of entries in factors = 2059669522
INFOG (12) Number of off diagonal pivots = 0
INFOG (13) Number of delayed pivots = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 3.245D-01
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 3.245D-01
RINFOG(21) Largest pivot in absolute value = 6.699D-01
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 1
Elapsed time in factorization driver = 47.3808
Factorization time by clock_gettime(): 47.3906 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 4 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_2
To display your profiling results:
###########################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_2 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_2 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_2 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_2 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_2 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_2 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_2 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_2 #
###########################################################################################################################################################
* [MAQAO] Info: Detected 8 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1489752 10319760
executing #MPI = 8 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L U Solver for unsymmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1489752
... Structural symmetry (in percent)= 100
Average density of rows/columns = 6
... No column permutation
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 12.7871
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.3264
A root of estimated size 7524 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 2059669522
-- (3) Real space for factors (estimated) = 2059728834
-- (4) Integer space for factors (estimated) = 18369951
-- (5) Maximum frontal size (estimated) = 11342
-- (6) Number of nodes in the tree = 136820
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 16
Number of split nodes = 0
RINFOG(1) Operations during elimination (estim)= 1.088D+13
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 4455
Total space in MBytes, IC factorization (INFOG(17)): 32677
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 2528
Total space in MBytes, OOC factorization (INFOG(27)): 17762
Elapsed time in analysis driver= 14.1068
Analysis time by clock_gettime(): 14.107 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1489752 10319760
executing #MPI = 8 and #OMP = 2
Elapsed time in save structure driver= 0.0004
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1489752 10319760
executing #MPI = 8 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
****** FACTORIZATION STEP ********
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
[0m ** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 8
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 2059728834
INFOG(4) Integer space for factors (estim.)= 18369951
Maximum frontal size (estimated) = 11342
Number of nodes in the tree = 136820
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 0.1489
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.59D-02
Average Effective size of S (based on INFO(39))= 294459586
Elapsed time to reformat/distribute matrix = 0.2195
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 10957652
Size of async. emission buffer (bytes).. = 32982527
Small emission buffer (bytes) .......... = 2016
** Memory allocated, max in Mbytes (INFOG(18)): 4455
** Memory allocated, total in Mbytes (INFOG(19)): 32675
** Memory effectively used, max in Mbytes (INFOG(21)): 3704
** Memory effectively used, total in Mbytes (INFOG(22)): 27623
Flops under L0 layer (avg/max across MPI) = 2.609D+11 3.022D+11
Elapsed time under L0 (avg/max across MPI) = 3.3299 3.8498
Elapsed time to process root node = 0.6825
Elapsed time for factorization = 30.7897
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 4.934D+09
------ (3) Operations in node elimination = 1.088D+13
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 2059669522
INFOG (10) Integer space for factors = 18370531
INFOG (11) Maximum front size = 11342
INFOG (29) Number of entries in factors = 2059669522
INFOG (12) Number of off diagonal pivots = 0
INFOG (13) Number of delayed pivots = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 3.245D-01
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 3.245D-01
RINFOG(21) Largest pivot in absolute value = 6.699D-01
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 8
Elapsed time in factorization driver = 31.2122
Factorization time by clock_gettime(): 31.2117 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 8 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_3
To display your profiling results:
###########################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_3 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_3 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_3 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_3 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_3 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_3 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_3 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_3 #
###########################################################################################################################################################
* [MAQAO] Info: Detected 16 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1489752 10319760
executing #MPI = 16 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L U Solver for unsymmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1489752
... Structural symmetry (in percent)= 100
Average density of rows/columns = 6
... No column permutation
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 12.8316
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.3281
A root of estimated size 7524 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 2059669522
-- (3) Real space for factors (estimated) = 2059874252
-- (4) Integer space for factors (estimated) = 18631246
-- (5) Maximum frontal size (estimated) = 11342
-- (6) Number of nodes in the tree = 136822
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 34
Number of split nodes = 2
RINFOG(1) Operations during elimination (estim)= 1.088D+13
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 2627
Total space in MBytes, IC factorization (INFOG(17)): 37596
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 1628
Total space in MBytes, OOC factorization (INFOG(27)): 23297
Elapsed time in analysis driver= 14.1375
Analysis time by clock_gettime(): 14.137 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1489752 10319760
executing #MPI = 16 and #OMP = 2
Elapsed time in save structure driver= 0.0005
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1489752 10319760
executing #MPI = 16 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
****** FACTORIZATION STEP ********
[0m GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 16
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 2059874252
INFOG(4) Integer space for factors (estim.)= 18631246
Maximum frontal size (estimated) = 11342
Number of nodes in the tree = 136822
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 0.1528
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.59D-02
Average Effective size of S (based on INFO(39))= 194828749
Elapsed time to reformat/distribute matrix = 0.2290
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 10957744
Size of async. emission buffer (bytes).. = 32982805
Small emission buffer (bytes) .......... = 6600
** Memory allocated, max in Mbytes (INFOG(18)): 2627
** Memory allocated, total in Mbytes (INFOG(19)): 37593
** Memory effectively used, max in Mbytes (INFOG(21)): 2001
** Memory effectively used, total in Mbytes (INFOG(22)): 29074
Flops under L0 layer (avg/max across MPI) = 6.724D+10 7.852D+10
Elapsed time under L0 (avg/max across MPI) = 1.2159 1.4424
Elapsed time to process root node = 0.4300
Elapsed time for factorization = 20.3216
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 5.132D+09
------ (3) Operations in node elimination = 1.088D+13
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 2059623706
INFOG (10) Integer space for factors = 18632413
INFOG (11) Maximum front size = 11342
INFOG (29) Number of entries in factors = 2059623706
INFOG (12) Number of off diagonal pivots = 0
INFOG (13) Number of delayed pivots = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 3.245D-01
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 3.245D-01
RINFOG(21) Largest pivot in absolute value = 6.699D-01
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 16
Elapsed time in factorization driver = 20.7369
Factorization time by clock_gettime(): 20.7376 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 16 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_4
To display your profiling results:
###########################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_4 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_4 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_4 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_4 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_4 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_4 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_4 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_4 #
###########################################################################################################################################################
* [MAQAO] Info: Detected 32 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1489752 10319760
executing #MPI = 32 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L U Solver for unsymmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1489752
... Structural symmetry (in percent)= 100
Average density of rows/columns = 6
... No column permutation
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 13.0792
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.3801
A root of estimated size 7524 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 2059669522
-- (3) Real space for factors (estimated) = 2061742808
-- (4) Integer space for factors (estimated) = 19277274
-- (5) Maximum frontal size (estimated) = 11342
-- (6) Number of nodes in the tree = 136823
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 66
Number of split nodes = 3
RINFOG(1) Operations during elimination (estim)= 1.088D+13
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 1296
Total space in MBytes, IC factorization (INFOG(17)): 36514
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 973
Total space in MBytes, OOC factorization (INFOG(27)): 22840
Elapsed time in analysis driver= 14.6452
Analysis time by clock_gettime(): 14.645 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1489752 10319760
executing #MPI = 32 and #OMP = 2
Elapsed time in save structure driver= 0.0009
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
[0m ** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1489752 10319760
executing #MPI = 32 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 32
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 2061742808
INFOG(4) Integer space for factors (estim.)= 19277274
Maximum frontal size (estimated) = 11342
Number of nodes in the tree = 136823
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 0.1668
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.59D-02
Average Effective size of S (based on INFO(39))= 99267967
Elapsed time to reformat/distribute matrix = 0.2987
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 6188140
Size of async. emission buffer (bytes).. = 18626302
Small emission buffer (bytes) .......... = 23368
** Memory allocated, max in Mbytes (INFOG(18)): 1296
** Memory allocated, total in Mbytes (INFOG(19)): 36506
** Memory effectively used, max in Mbytes (INFOG(21)): 1002
** Memory effectively used, total in Mbytes (INFOG(22)): 28263
Flops under L0 layer (avg/max across MPI) = 1.804D+10 2.657D+10
Elapsed time under L0 (avg/max across MPI) = 0.5202 0.6376
Elapsed time to process root node = 0.3060
Elapsed time for factorization = 14.5201
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 5.231D+09
------ (3) Operations in node elimination = 1.088D+13
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 2059626604
INFOG (10) Integer space for factors = 19281265
INFOG (11) Maximum front size = 11342
INFOG (29) Number of entries in factors = 2059626604
INFOG (12) Number of off diagonal pivots = 0
INFOG (13) Number of delayed pivots = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 3.245D-01
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 3.245D-01
RINFOG(21) Largest pivot in absolute value = 6.699D-01
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 60
Extra copies due to In-Place stacking = 954000
Elapsed time in factorization driver = 15.0269
Factorization time by clock_gettime(): 15.0402 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 32 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_5
To display your profiling results:
###########################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_5 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_5 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_5 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_5 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_5 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_5 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_5 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_5 #
###########################################################################################################################################################
* [MAQAO] Info: Detected 64 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1489752 10319760
executing #MPI = 64 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L U Solver for unsymmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1489752
... Structural symmetry (in percent)= 100
Average density of rows/columns = 6
... No column permutation
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 13.0909
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.3760
A root of estimated size 7524 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 2059669522
-- (3) Real space for factors (estimated) = 2064897285
-- (4) Integer space for factors (estimated) = 20056357
-- (5) Maximum frontal size (estimated) = 11342
-- (6) Number of nodes in the tree = 136836
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 137
Number of split nodes = 16
RINFOG(1) Operations during elimination (estim)= 1.088D+13
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 803
Total space in MBytes, IC factorization (INFOG(17)): 40410
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 625
Total space in MBytes, OOC factorization (INFOG(27)): 30079
Elapsed time in analysis driver= 14.6668
Analysis time by clock_gettime(): 14.668 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1489752 10319760
executing #MPI = 64 and #OMP = 2
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
Elapsed time in save structure driver= 0.0012
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1489752 10319760
executing #MPI = 64 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
[0m ** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 64
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 2064897285
INFOG(4) Integer space for factors (estim.)= 20056357
Maximum frontal size (estimated) = 11342
Number of nodes in the tree = 136836
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 0.1599
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.59D-02
Average Effective size of S (based on INFO(39))= 57430258
Elapsed time to reformat/distribute matrix = 0.3397
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 4875872
Size of async. emission buffer (bytes).. = 14676371
Small emission buffer (bytes) .......... = 87764
** Memory allocated, max in Mbytes (INFOG(18)): 803
** Memory allocated, total in Mbytes (INFOG(19)): 40381
** Memory effectively used, max in Mbytes (INFOG(21)): 623
** Memory effectively used, total in Mbytes (INFOG(22)): 30630
Flops under L0 layer (avg/max across MPI) = 4.193D+09 1.047D+10
Elapsed time under L0 (avg/max across MPI) = 0.2448 0.4200
Elapsed time to process root node = 0.2686
Elapsed time for factorization = 9.2370
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 6.335D+09
------ (3) Operations in node elimination = 1.088D+13
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 2059580484
INFOG (10) Integer space for factors = 20057020
INFOG (11) Maximum front size = 11335
INFOG (29) Number of entries in factors = 2059580484
INFOG (12) Number of off diagonal pivots = 0
INFOG (13) Number of delayed pivots = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 3.245D-01
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 3.245D-01
RINFOG(21) Largest pivot in absolute value = 6.699D-01
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 136
Extra copies due to In-Place stacking = 1084035
Elapsed time in factorization driver = 9.7913
Factorization time by clock_gettime(): 9.8054 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 64 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_6
To display your profiling results:
###########################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_6 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_6 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_6 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_6 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_6 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_6 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_6 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_6 #
###########################################################################################################################################################
* [MAQAO] Info: Detected 86 Lprof instances in igk-0805.
If this is incorrect, rerun with number-processes-per-node=X
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 1 1489752 10319760
executing #MPI = 86 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
=================================================
MUMPS compiled with option -Dmetis
MUMPS compiled with option -Dpord
MUMPS compiled with option -Dptscotch
MUMPS compiled with option -Dscotch
=================================================
L U Solver for unsymmetric matrices
Type of parallelism: Working host
****** ANALYSIS STEP ********
Processing a graph of size: 1489752
... Structural symmetry (in percent)= 100
Average density of rows/columns = 6
... No column permutation
Ordering based on METIS
ELAPSED TIME SPENT IN METIS reordering = 13.0879
SYMBOLIC based on column counts
ELAPSED TIME IN symbolic factorization = 0.3825
A root of estimated size 7524 has been selected for Scalapack.
Leaving analysis phase with ...
INFOG(1) = 0
INFOG(2) = 0
-- (20) Number of entries in factors (estim.) = 2059669522
-- (3) Real space for factors (estimated) = 2069442628
-- (4) Integer space for factors (estimated) = 20803663
-- (5) Maximum frontal size (estimated) = 11342
-- (6) Number of nodes in the tree = 136847
-- (32) Type of analysis effectively used = 1
-- (7) Ordering option effectively used = 5
ICNTL (6) Maximum transversal option = 0
ICNTL (7) Pivot order option = 7
ICNTL(12) Ordering symmetric indef. matrices = 1
ICNTL(13) Parallelism/splitting of root node = 0
ICNTL(14) Percentage of memory relaxation = 30
ICNTL(15) Analysis by block effectively used = 0
ICNTL(18) Distributed input matrix (on if >0) = 0
ICNTL(32) Forward elimination during facto. = 0
ICNTL(35) BLR activation = 0
ICNTL(48) Tree based multithreading (effective)= 1
ICNTL(58) Symbolic factorization option = 2
Number of level 2 nodes = 164
Number of split nodes = 27
RINFOG(1) Operations during elimination (estim)= 1.088D+13
MEMORY ESTIMATIONS ...
Estimations with standard Full-Rank (FR) factorization:
Maximum estim. space in Mbytes, IC facto. (INFOG(16)): 577
Total space in MBytes, IC factorization (INFOG(17)): 41982
Maximum estim. space in Mbytes, OOC facto. (INFOG(26)): 434
Total space in MBytes, OOC factorization (INFOG(27)): 31516
Elapsed time in analysis driver= 14.7071
Analysis time by clock_gettime(): 14.709 s
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 7 1489752 10319760
executing #MPI = 86 and #OMP = 2
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
Elapsed time in save structure driver= 0.0016
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
On return from DMUMPS, INFOG(1)= -71
On return from DMUMPS, INFOG(2)= 0
PRE FACTO START LPROF----------------------
Entering DMUMPS 5.8.2 from C interface with JOB, N, NNZ = 2 1489752 10319760
executing #MPI = 86 and #OMP = 2
Advanced settings:
KEEP(370) Static mapping = 1
KEEP(371) Advanced optimizations = 0
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
** ERROR RETURN ** FROM DMUMPS INFO(1)= -71
** INFO(2)= 0
PRE FACTO START LPROF----------------------
* [MAQAO] Info: STARTING COUNTERS (igk-0805)
[0m
****** FACTORIZATION STEP ********
GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
Number of working processes = 86
ICNTL(22) Out-of-core option = 0
ICNTL(35) BLR activation (eff. choice) = 0
ICNTL(37) BLR CB compression (eff. choice) = 0
ICNTL(49) Compact workarray S (end facto.) = 0
ICNTL(56) Effective value during facto. = 0
ICNTL(14) Memory relaxation = 30
INFOG(3) Real space for factors (estimated)= 2069442628
INFOG(4) Integer space for factors (estim.)= 20803663
Maximum frontal size (estimated) = 11342
Number of nodes in the tree = 136847
ICNTL(23) Memory allowed (value on host) = 0
Sum over all procs = 0
Memory provided by user, sum of LWK_USER = 0
Effective threshold for pivoting, CNTL(1) = 0.1000D-01
Statistics on the scaling phase
Elapsed time for scaling = 0.2031
Max difference from 1 after scaling the entries for ONE-NORM (option 7/8) = 0.59D-02
Average Effective size of S (based on INFO(39))= 44741959
Elapsed time to reformat/distribute matrix = 0.3292
Allocated buffers
------------------
Size of reception buffer in bytes ...... = 3333684
Size of async. emission buffer (bytes).. = 10034389
Small emission buffer (bytes) .......... = 155360
** Memory allocated, max in Mbytes (INFOG(18)): 577
** Memory allocated, total in Mbytes (INFOG(19)): 41952
** Memory effectively used, max in Mbytes (INFOG(21)): 451
** Memory effectively used, total in Mbytes (INFOG(22)): 31108
Flops under L0 layer (avg/max across MPI) = 2.349D+09 5.651D+09
Elapsed time under L0 (avg/max across MPI) = 0.2366 0.3681
Elapsed time to process root node = 0.2371
Elapsed time for factorization = 9.0465
Leaving factorization with ...
RINFOG (2) Operations in node assembly = 7.333D+09
------ (3) Operations in node elimination = 1.088D+13
ICNTL (8) Scaling effectively used = 7
INFOG (9) Real space for factors = 2059498054
INFOG (10) Integer space for factors = 20800676
INFOG (11) Maximum front size = 11335
INFOG (29) Number of entries in factors = 2059498054
INFOG (12) Number of off diagonal pivots = 0
INFOG (13) Number of delayed pivots = 0
RINFOG(19) Smallest pivot WITH perturbed pivots = 3.245D-01
RINFOG(20) Smallest pivot WITHOUT perturbed pivots = 3.245D-01
RINFOG(21) Largest pivot in absolute value = 6.699D-01
INFOG (24) Effective value of ICNTL(12) = 1
INFOG (14) Number of memory compress = 214
Extra copies due to In-Place stacking = 1294909
Elapsed time in factorization driver = 9.6260
Factorization time by clock_gettime(): 9.6402 s
Entering DMUMPS 5.8.2 from C interface with JOB = -2
executing #MPI = 86 and #OMP = 2
Your experiment path is /home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_7
To display your profiling results:
###########################################################################################################################################################
# LEVEL | REPORT | COMMAND #
###########################################################################################################################################################
# Functions | Cluster-wide | maqao lprof -df xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_7 #
# Functions | Per-node | maqao lprof -df -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_7 #
# Functions | Per-process | maqao lprof -df -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_7 #
# Functions | Per-thread | maqao lprof -df -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_7 #
# Loops | Cluster-wide | maqao lprof -dl xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_7 #
# Loops | Per-node | maqao lprof -dl -dn xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_7 #
# Loops | Per-process | maqao lprof -dl -dp xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_7 #
# Loops | Per-thread | maqao lprof -dl -dt xp=/home/mlkaps_org/kevin/matrices/test_m1-86_o2_vass_atmosmodl_scala_kptr_probe/tools/lprof_run_7 #
###########################################################################################################################################################