Help is available by moving the cursor over any symbol or by checking the MAQAO website.
Compared Reports
r0: tbb_1
r1: tbb_2
r2: tbb_4
r3: tbb_8
r4: tbb_16
r5: tbb_32
r6: tbb_64
r7: tbb_128
Global Metrics

Metric                                r0      r1      r2      r3      r4      r5      r6      r7
Total Time (s)                        209.44  131.78  90.89   69.66   61.47   58.83   58.00   63.66
Max (Thread Active Time) (s)          134.53  69.16   35.25   18.21   10.62   7.70    7.17    8.61
Average Active Time (s)               134.53  69.11   35.12   18.18   10.58   7.68    7.14    8.48
Activity Ratio (%)                    64.2    59.7    52.5    43.2    33.4    28.6    27.0    27.2
Average number of active threads      0.642   1.049   1.546   2.088   2.754   4.175   7.881   17.050
Affinity Stability (%)                64.4    59.9    52.5    43.2    33.5    28.7    27.1    27.5
Time in analyzed loops (%)            27.0    27.8    27.8    29.1    25.4    17.8    9.42    4.30
Time in analyzed innermost loops (%)  16.1    16.8    16.7    17.7    15.4    10.8    5.61    2.57
Time in user code (%)                 34.3    35.1    35.4    36.3    31.9    22.5    12.0    5.53
Compilation Options Score (%)         100     100     100     100     100     100.0   100.0   99.9
Array Access Efficiency (%)           95.3    95.7    95.6    96.0    96.0    95.8    96.0    96.0
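
The Total Time row can be turned into speedup and parallel efficiency figures relative to r0. The following is a minimal sketch, not part of the MAQAO output; it assumes that the run label tbb_N means the run used N TBB threads, which the report does not state explicitly. The function and variable names are illustrative.

```python
# Total Time (s) per run, copied from the Global Metrics table.
TOTAL_TIME_S = {
    "r0": 209.44, "r1": 131.78, "r2": 90.89, "r3": 69.66,
    "r4": 61.47, "r5": 58.83, "r6": 58.00, "r7": 63.66,
}

# Assumption: the run label tbb_N means the run used N TBB threads.
THREADS = {
    "r0": 1, "r1": 2, "r2": 4, "r3": 8,
    "r4": 16, "r5": 32, "r6": 64, "r7": 128,
}


def speedup_and_efficiency(total_time, threads, baseline="r0"):
    """Return {run: (speedup, efficiency)} relative to the baseline run."""
    t_base = total_time[baseline]
    n_base = threads[baseline]
    result = {}
    for run, t_run in total_time.items():
        speedup = t_base / t_run
        # Efficiency: achieved speedup divided by the increase in thread count.
        efficiency = speedup * n_base / threads[run]
        result[run] = (speedup, efficiency)
    return result


SCALING = speedup_and_efficiency(TOTAL_TIME_S, THREADS)

if __name__ == "__main__":
    for run, (speedup, efficiency) in SCALING.items():
        print(f"{run}: speedup {speedup:.2f}x, efficiency {efficiency:.1%}")
```

With these inputs the speedup peaks at about 3.6x for r6 and falls back to about 3.3x for r7, where the total time increases again.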
Potential Speedups

Metric                                                      r0      r1      r2      r3      r4      r5      r6      r7
Perfect Flow Complexity                                     1.01    1.01    1.01    1.01    1.01    1.01    1.00    1.00
Perfect OpenMP + MPI + Pthread                              1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
Perfect OpenMP + MPI + Pthread + Perfect Load Distribution  1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.01
Scalability - Gap                                           1.00    1.26    1.74    2.66    4.70    8.99    17.72   38.91
No Scalar Integer: Potential Speedup                        1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
No Scalar Integer: Nb Loops to get 80%                      3       3       3       3       3       3       3       2
FP Vectorised: Potential Speedup                            1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
FP Vectorised: Nb Loops to get 80%                          2       2       2       2       2       2       2       2
Fully Vectorised: Potential Speedup                         1.13    1.14    1.13    1.15    1.13    1.09    1.04    1.02
Fully Vectorised: Nb Loops to get 80%                       2       2       2       2       2       2       1       1
Only FP Arithmetic: Potential Speedup                       1.03    1.03    1.03    1.03    1.03    1.02    1.01    1.00
Only FP Arithmetic: Nb Loops to get 80%                     5       5       5       5       5       5       5       5
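
The Scalability - Gap values appear consistent with the ratio of the measured total time to the ideal time under perfect linear scaling from r0. The sketch below checks that reading against the reported figures. The formula is inferred from the numbers in this report and is not taken from MAQAO documentation; it also assumes that tbb_N means N threads. All names are illustrative.

```python
# Total Time (s) and assumed thread count per run (tbb_N taken to mean N threads).
RUNS = {
    "r0": (209.44, 1), "r1": (131.78, 2), "r2": (90.89, 4), "r3": (69.66, 8),
    "r4": (61.47, 16), "r5": (58.83, 32), "r6": (58.00, 64), "r7": (63.66, 128),
}

# Scalability - Gap values as reported in the table above.
REPORTED_GAP = {
    "r0": 1.00, "r1": 1.26, "r2": 1.74, "r3": 2.66,
    "r4": 4.70, "r5": 8.99, "r6": 17.72, "r7": 38.91,
}


def scalability_gap(t_run, n_run, t_ref, n_ref=1):
    """Measured time divided by the time expected under linear scaling."""
    ideal_time = t_ref * n_ref / n_run
    return t_run / ideal_time


T_REF, N_REF = RUNS["r0"]
COMPUTED_GAP = {
    run: scalability_gap(t, n, T_REF, N_REF) for run, (t, n) in RUNS.items()
}

if __name__ == "__main__":
    for run, gap in COMPUTED_GAP.items():
        print(f"{run}: computed {gap:.2f}, reported {REPORTED_GAP[run]:.2f}")
```

Under this reading, a gap of 38.91 for r7 means the run takes roughly 39 times longer than it would if the r0 time had scaled linearly to 128 threads.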
[Figures: Scalability Speedup; Cumulated Speedup If No Scalar Integer; Cumulated Speedup If FP Vectorized; Cumulated Speedup If Fully Vectorized; Cumulated Speedup If Only FP Arithmetic]
Loop Based Profiles
[Figures: Innermost / Single Loops; Inbetween Loops; Outermost Loops; Cumulated Coverage With All Loops]
Innermost Loop Based Profiles
[Figures: Coverage; Count]
Application Categorization
[Figure: Time per run (r0 to r7), axis 0.00 to 160.00, by category: Binary, Math, libdofs.so, Others, System, Memory, libnon_linear_solvers.so, libfinite_elements.so, Pthread]
[Figure: Coverage per run (r0 to r7), axis 0.00 to 100.00, by category: Binary, Math, libdofs.so, Others, System, Memory, libnon_linear_solvers.so, libfinite_elements.so, Pthread]
Compilation Options

Source Object / Issue
[vdso]
  ○ -g is missing for some functions (possibly ones added by the compiler); it is needed for more accurate reports. Other recommended flags are: -O2/-O3, -march=(target)
  ○ -O2, -O3 or -Ofast is missing.
  ○ -march=(target) is missing.
libfinite_elements.so
  MapBase.h
  element_U.tpp
  TensorMap.h
  TensorDeviceDefault.h
  AssignEvaluator.h
  GeneralProduct.h
  generic_elements.hpp
  stl_vector.h
  DenseStorage.h
  material_brick.hpp
  InverseImpl.h
  Memory.h
  PlainObjectBase.h
libdofs.so
  MapBase.h
  stl_vector.h
  dof_list.cpp
  stl_iterator.h
  dof.cpp
multithreading_assembly_perf_test
  finite_elements.hpp
  basic_string.tcc
  enumerable_thread_specific.h
  assembler.hpp
  parallel_for.h
libnon_linear_solvers.so
  DenseStorage.h
Path Count Profiles
[Figures: Coverage; Count]
Low Iteration Count Profiles
[Figures: Coverage; Count]
Average Number of Active Threads
Run 1 - tbb_1
Run 2 - tbb_2
Run 3 - tbb_4
Run 4 - tbb_8
Run 5 - tbb_16
Run 6 - tbb_32
Run 7 - tbb_64
Run 8 - tbb_128
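
The Average number of active threads values in Global Metrics are reproduced, to within rounding, by multiplying the thread count by Average Active Time and dividing by Total Time. The sketch below performs that check. The relation is inferred from the figures in this report, not from a documented MAQAO definition, and it assumes that tbb_N means N threads. All names are illustrative.

```python
# Per run: (assumed thread count, Average Active Time (s), Total Time (s)).
# Times are copied from Global Metrics; thread counts assume tbb_N = N threads.
RUN_DATA = {
    "r0": (1, 134.53, 209.44), "r1": (2, 69.11, 131.78),
    "r2": (4, 35.12, 90.89), "r3": (8, 18.18, 69.66),
    "r4": (16, 10.58, 61.47), "r5": (32, 7.68, 58.83),
    "r6": (64, 7.14, 58.00), "r7": (128, 8.48, 63.66),
}

# Average number of active threads as reported in Global Metrics.
REPORTED_ACTIVE = {
    "r0": 0.642, "r1": 1.049, "r2": 1.546, "r3": 2.088,
    "r4": 2.754, "r5": 4.175, "r6": 7.881, "r7": 17.050,
}


def average_active_threads(n_threads, avg_active_s, total_s):
    """Total active thread-seconds spread over the wall-clock duration."""
    return n_threads * avg_active_s / total_s


COMPUTED_ACTIVE = {
    run: average_active_threads(*values) for run, values in RUN_DATA.items()
}

if __name__ == "__main__":
    for run, value in COMPUTED_ACTIVE.items():
        print(f"{run}: computed {value:.3f}, reported {REPORTED_ACTIVE[run]:.3f}")
```

The small differences for r5 and r6 are of the size expected from the two-decimal rounding of the input times.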
Experiment Summaries

Field               r0                                   r1            r2 to r7
Application         ./multithreading_assembly_perf_test  same as r0    same as r0
Timestamp           2025-05-20 11:59:19                  same as r0    same as r0
Experiment Type     MPI;                                 MPI; OpenMP;  same as r1
Machine             be-seq028                            same as r0    same as r0
Architecture        x86_64                               same as r0    same as r0
Micro Architecture  ZEN_V4                               same as r0    same as r0
Model Name          AMD EPYC 9534 64-Core Processor      same as r0    same as r0
Cache Size          1024 KB                              same as r0    same as r0
Number of Cores     64                                   same as r0    same as r0
Maximal Frequency   3.718066 GHz                         same as r0    same as r0
OS Version
Linux 4.18.0-477.10.1.el8_8.x86_64 #1 SMP Wed Apr 5 13:35:01 EDT 2023