Help is available by hovering the cursor over any symbol or by checking the MAQAO website.

| Global Metric | Value |
|---|---|
| Total Time (s) | 65.15 |
| Max (Thread Active Time) (s) | 8.58 |
| Average Active Time (s) | 8.54 |
| Activity Ratio (%) | 26.9 |
| Average number of active threads | 16.775 |
| Affinity Stability (%) | 27.0 |
| Time in analyzed loops (%) | 4.34 |
| Time in analyzed innermost loops (%) | 2.61 |
| Time in user code (%) | 5.54 |
| Compilation Options Score (%) | 99.9 |
| Array Access Efficiency (%) | 96.2 |

| Potential Speedups | Potential Speedup | Nb Loops to get 80% |
|---|---|---|
| Perfect Flow Complexity | 1.00 | |
| Perfect OpenMP + MPI + Pthread | 1.00 | |
| Perfect OpenMP + MPI + Pthread + Perfect Load Distribution | 1.00 | |
| No Scalar Integer | 1.00 | 2 |
| FP Vectorised | 1.00 | 2 |
| Fully Vectorised | 1.02 | 1 |
| FP Arithmetic Only | 1.00 | 5 |
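
The metrics above can be cross-checked with a little arithmetic. The sketch below is not MAQAO's own computation. It rests on two assumptions: that the average number of active threads is roughly the observed thread count (128 in this report) scaled by average active time over total time, and that an Amdahl-style bound applies, so that eliminating a region covering a fraction p of the run yields a speedup of at most 1/(1 - p). The function names are illustrative only.

```python
def average_active_threads(n_threads, avg_active_time_s, total_time_s):
    """Assumed relation: thread count scaled by the share of wall time spent active."""
    if total_time_s <= 0:
        raise ValueError("total_time_s must be positive")
    return n_threads * avg_active_time_s / total_time_s


def amdahl_upper_bound(fraction):
    """Upper bound on whole-run speedup if a region taking `fraction` of the time cost nothing."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return 1.0 / (1.0 - fraction)


if __name__ == "__main__":
    # Values copied from the report: 128 threads, 8.54 s average active time, 65.15 s total.
    estimate = average_active_threads(128, 8.54, 65.15)
    print(f"estimated average active threads: {estimate:.2f}")

    # Time in analyzed loops (4.34 %) and in analyzed innermost loops (2.61 %).
    print(f"bound from analyzed loops:  {amdahl_upper_bound(0.0434):.3f}")
    print(f"bound from innermost loops: {amdahl_upper_bound(0.0261):.3f}")
```

Under these assumptions the estimate comes out at about 16.78 active threads, close to the reported 16.775, and no loop-level optimisation could speed up the whole run by more than about 1.05x. That is consistent with the reported potential speedups of 1.00 to 1.02.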
| Source Object | Issue |
|---|---|
| ▼[vdso] | |
| ▼ | |
| ○ | -g is missing for some functions (possibly ones added by the compiler), it is needed to have more accurate reports. Other recommended flags are: -O2/-O3, -march=(target) |
| ○ | -O2, -O3 or -Ofast is missing. |
| ○ | -march=(target) is missing. |
| ▼libfinite_elements.so | |
| ○InverseImpl.h | |
| ○element_U.tpp | |
| ○TensorMap.h | |
| ○GeneralProduct.h | |
| ○generic_elements.hpp | |
| ○stl_vector.h | |
| ○AssignEvaluator.h | |
| ○TensorDeviceDefault.h | |
| ○MapBase.h | |
| ○PlainObjectBase.h | |
| ▼libdofs.so | |
| ○dof_list.cpp | |
| ○dof.cpp | |
| ○MapBase.h | |
| ○stl_vector.h | |
| ▼multithreading_assembly_perf_test | |
| ○enumerable_thread_specific.h | |
| ○finite_elements.hpp | |
| ○assembler.hpp | |
| ▼libnon_linear_solvers.so | |
| ▼ | |
| ○ | -g is missing for some functions (possibly ones added by the compiler), it is needed to have more accurate reports. Other recommended flags are: -O2/-O3, -march=(target) |
| ○ | -O2, -O3 or -Ofast is missing. |
| ○ | -march=(target) is missing. |

| Experiment Property | Value |
|---|---|
| Application | ./multithreading_assembly_perf_test |
| Timestamp | 2025-05-20 10:57:29 |
| Universal Timestamp | 1747731449 |
| Number of processes observed | 1 |
| Number of threads observed | 128 |
| Experiment Type | MPI; OpenMP |
| Machine | be-seq033 |
| Model Name | AMD EPYC 9534 64-Core Processor |
| Architecture | x86_64 |
| Micro Architecture | ZEN_V4 |
| Cache Size | 1024 KB |
| Number of Cores | 64 |
| OS Version | Linux 4.18.0-477.10.1.el8_8.x86_64 #1 SMP Wed Apr 5 13:35:01 EDT 2023 |
| Architecture used during static analysis | x86_64 |
| Micro Architecture used during static analysis | ZEN_V4 |
| Frequency Driver | acpi-cpufreq |
| Frequency Governor | performance |
| Huge Pages | always |
| Hyperthreading | off |
| Number of sockets | 2 |
| Number of cores per socket | 64 |
| Compilation Options: [vdso] | N/A |
| Compilation Options: libdofs.so | GNU C++20 13.2.0 -march=znver4 -g3 -O3 -std=c++20 -fno-omit-frame-pointer -fopenmp -funroll-loops -fPIC |
| Compilation Options: libfinite_elements.so | GNU C++20 13.2.0 -march=znver4 -g3 -O3 -std=c++20 -fno-omit-frame-pointer -fopenmp -funroll-loops -fPIC |
| Compilation Options: libnon_linear_solvers.so | N/A |
| Compilation Options: multithreading_assembly_perf_test | GNU C++20 13.2.0 -march=znver4 -g3 -O3 -std=c++20 -fno-omit-frame-pointer -fopenmp -funroll-loops |
| Dataset | |
| Run Command | <executable> --max_threads <OMP_NUM_THREADS> --ncut 200 --method ColMutexes --storage SparseCOO |
| MPI Command | mpirun -n <number_processes> --map-by slot:PE=<OMP_NUM_THREADS> --bind-to core |
| Number Processes | 1 |
| Number Nodes | 1 |
| Filter | Not Used |
| Profile Start | Not Used |
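
The Run Command and MPI Command entries above are templates with `<...>` placeholders. As a minimal sketch of how they might expand into a concrete launch line, the snippet below substitutes the placeholders and prefixes the MPI command to the run command. That ordering is an assumption about how the two are combined, the helper names `fill` and `build_command` are hypothetical, and the thread count of 128 is taken from the number of threads observed rather than from a recorded `OMP_NUM_THREADS` value.

```python
import re
import shlex

# Templates copied verbatim from the report.
RUN_TEMPLATE = ("<executable> --max_threads <OMP_NUM_THREADS> --ncut 200 "
                "--method ColMutexes --storage SparseCOO")
MPI_TEMPLATE = ("mpirun -n <number_processes> "
                "--map-by slot:PE=<OMP_NUM_THREADS> --bind-to core")


def fill(template, values):
    """Replace each <name> placeholder with its shell-quoted value."""
    out = template
    for name, value in values.items():
        out = out.replace(f"<{name}>", shlex.quote(str(value)))
    leftover = re.findall(r"<[A-Za-z_]+>", out)
    if leftover:
        raise ValueError(f"unfilled placeholders: {leftover}")
    return out


def build_command(executable, number_processes, omp_num_threads):
    """Return the launch line as an argument list (assumed order: MPI prefix, then run command)."""
    values = {
        "executable": executable,
        "number_processes": number_processes,
        "OMP_NUM_THREADS": omp_num_threads,
    }
    return shlex.split(fill(MPI_TEMPLATE, values)) + shlex.split(fill(RUN_TEMPLATE, values))


if __name__ == "__main__":
    # 1 process as reported; 128 threads is an assumption based on the threads observed.
    cmd = build_command("./multithreading_assembly_perf_test", 1, 128)
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Returning an argument list rather than a single string keeps the result usable with `subprocess.run` without going through a shell.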