
cp2k.psmp - 2023-07-11 15:11:02 - MAQAO 2.17.5


Global Metrics

Total Time (s): 613.40
Profiled Time (s): 564.75
Time in analyzed loops (%): 88.2
Time in analyzed innermost loops (%): 54.9
Time in user code (%): 77.5
Compilation Options Score (%): 94.1
Perfect Flow Complexity: 1.11
Array Access Efficiency (%): 74.2
GFLOPS: 112.434
Perfect OpenMP + MPI + Pthread: 1.02
Perfect OpenMP + MPI + Pthread + Perfect Load Distribution: 1.06
No Scalar Integer
    Potential Speedup: 1.15
    Nb Loops to get 80%: 11
FP Vectorised
    Potential Speedup: 1.28
    Nb Loops to get 80%: 13
Fully Vectorised
    Potential Speedup: 3.15
    Nb Loops to get 80%: 41
FP Arithmetic Only
    Potential Speedup: 1.69
    Nb Loops to get 80%: 41
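
The potential speedups above can be turned into projected run times with simple arithmetic. The sketch below is illustrative only: it assumes each speedup is a whole-application factor applied to Total Time (the report does not state the base), and the names `projected_time` and `projected` are hypothetical helpers, not part of MAQAO.

```python
# Values copied from the Global Metrics table.
total_time_s = 613.40
potential_speedups = {
    "No Scalar Integer": 1.15,
    "FP Vectorised": 1.28,
    "Fully Vectorised": 3.15,
    "FP Arithmetic Only": 1.69,
}


def projected_time(total_s, speedup):
    """Projected run time in seconds.

    Assumption: the speedup is a whole-application factor relative to
    Total Time; the report itself does not say which time base is used.
    """
    return total_s / speedup


projected = {
    name: projected_time(total_time_s, speedup)
    for name, speedup in potential_speedups.items()
}

for name, seconds in projected.items():
    saved = total_time_s - seconds
    print(f"{name:20s} {seconds:8.2f} s  (saves {saved:7.2f} s)")
```

Under that assumption, the scenarios are optimistic upper bounds, and the "Nb Loops to get 80%" rows indicate how many loops would need to be reworked to reach most of each gain.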

CQA Potential Speedups Summary

Loop Based Profile

Innermost Loop Based Profile

Application Categorization

Compilation Options

Source Object: Issue
cp2k.psmp
ialltoall_intra_sched_permuted_sendrecv.c: -x(target) or -ax(target) is missing.
dbcsr_api.F
ai_contraction.F
sendrecvf.c: -x(target) or -ax(target) is missing.
alltoallv_allcomm_auto.c: -x(target) or -ax(target) is missing.
mpidu_sched.c: -x(target) or -ax(target) is missing.
aux_basis_set.F
dbcsr_list_routinereport.F
dbcsr_mp_operations.F
dbcsr_mm_cannon.F
cp_fm_diag_utils.F
admm_dm_methods.F
molecule_types.F
xc_derivative_set_types.F
contextid.c: -x(target) or -ax(target) is missing.
ai_coulomb_test.F
lri_environment_methods.F
i_mpi_memcpy_sse.h: -x(target) or -ax(target) is missing.
semi_empirical_expns3_types.F
xc_ke_gga.F
pmi2_virtualization.c: -x(target) or -ax(target) is missing.
pmi_virtualization.c: -x(target) or -ax(target) is missing.
dbcsr_acc_stream.F
iallreduce_intra_sched_reduce_scatter_allgather.c: -x(target) or -ax(target) is missing.
atom_fit.F
basis_set_types.F
xc_xwpbe.F
mpit.c: -x(target) or -ax(target) is missing.
cp_min_heap.F
cp_dbcsr_operations.F
alltoallv_inter_pairwise_exchange.c: -x(target) or -ax(target) is missing.
ai_onecenter.F
ofi_am_events.h: -x(target) or -ax(target) is missing.
neighbor_alltoallw_allcomm_auto.c: -x(target) or -ax(target) is missing.
fft_tools.F
impi_bitmap_sse.c: -x(target) or -ax(target) is missing.
json_util.c: -x(target) or -ax(target) is missing.
atom_utils.F
iallreduce_intra_sched_ring.c: -x(target) or -ax(target) is missing.
dbcsr_iter_types.F
cp_fm_types.F
json_object.c: -x(target) or -ax(target) is missing.
cp_array_utils.F
dbm_mpi.c
task_list_methods.F
cp_linked_list_xc_deriv.F
dbcsr_list_timerenv.F
ialltoall_intra_sched_inplace.c: -x(target) or -ax(target) is missing.
grid_api.F
pw_methods.F
external_potential_types.F
autoreg_coll_tree_json.h: -x(target) or -ax(target) is missing.
grid_context_cpu.c
hcoll_init.c: -x(target) or -ax(target) is missing.
dbcsr_ptr_util.F
force_field_kind_types.F
cp_blacs_env.F
mpir_pmi.c: -x(target) or -ax(target) is missing.
xc_util.F
ialltoallv_intra_sched_blocked.c: -x(target) or -ax(target) is missing.
bitmap.c: -x(target) or -ax(target) is missing.
dbcsr_acc_hostmem.F
mathlib.F
alltoall_intra_brucks.c: -x(target) or -ax(target) is missing.
qs_ks_utils.F
dbcsr_acc_device.F
force_env_types.F
shm_hooks.c: -x(target) or -ax(target) is missing.
ialltoall_intra_sched_brucks.c: -x(target) or -ax(target) is missing.
ai_contraction_sphi.F
pm_recognition.c: -x(target) or -ax(target) is missing.
dbcsr_mpiwrap.F
cp_linked_list_input.F
pmi1_virtualization.c: -x(target) or -ax(target) is missing.
bind.c: -x(target) or -ax(target) is missing.
alltoall_intra_pairwise_sendrecv_replace.c: -x(target) or -ax(target) is missing.
posix_eager_impl.c: -x(target) or -ax(target) is missing.
recv.c: -x(target) or -ax(target) is missing.
ialltoall_intra_sched_pairwise.c: -x(target) or -ax(target) is missing.
dbcsr_data_methods_low.F
setbot.c: -x(target) or -ax(target) is missing.
ch4i_workq.h: -x(target) or -ax(target) is missing.

Loop Path Count Profile

Cumulated Speedup If No Scalar Integer

Cumulated Speedup If FP Vectorized

Cumulated Speedup If Fully Vectorized

Cumulated Speedup If FP Arithmetic Only

Experiment Summary

Application: /scratch/eoseret/cp2k-2023.1/exe/Linux-intel-x86_64-minimal/cp2k.psmp
Timestamp: 2023-07-11 15:11:02
Universal Timestamp: 1689081062
Number of processes observed: 52
Number of threads observed: 52
Experiment Type: MPI; OpenMP;
Machine: skylake
Model Name: Intel(R) Xeon(R) Platinum 8170 CPU @ 2.10GHz
Architecture: x86_64
Micro Architecture: SKYLAKE
Cache Size: 36608 KB
Number of Cores: 26
OS Version: Linux 6.4.1-arch2-1 #1 SMP PREEMPT_DYNAMIC Tue, 04 Jul 2023 08:39:40 +0000
Architecture used during static analysis: x86_64
Micro Architecture used during static analysis: SKYLAKE
Frequency Driver: intel_cpufreq
Frequency Governor: schedutil
Huge Pages: always
Hyperthreading: off
Number of sockets: 2
Number of cores per socket: 26
Compilation Options:
    cp2k.psmp: Intel(R) C Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.8.0 Build 20221119_000000 -I/opt/intel/oneapi/mpi/2021.8.0/include -c -O2 -fopenmp -fp-model precise -funroll-loops -g -fno-omit-frame-pointer -fcf-protection=none -no-pie -qopenmp-simd -traceback -xHost
Comments: ECR skylake, cp2k Intel ifort/MKL/MPI 2023.0, 52 MPI ranks, OMP_NUM_THREADS=1, (SVP / no kpoints) dataset
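
The hardware and process counts in this summary can be cross-checked with a few lines of arithmetic. This is a minimal sketch using only values copied from the summary; the variable names are illustrative, and it assumes the "Number of Cores" figure of 26 is a per-socket count, which matches "Number of cores per socket".

```python
# Values copied from the Experiment Summary.
sockets = 2
cores_per_socket = 26
processes_observed = 52
threads_observed = 52
hyperthreading_enabled = False  # "Hyperthreading: off"

# With hyperthreading off, logical cores equal physical cores.
physical_cores = sockets * cores_per_socket

# Ratios used to judge how the run maps onto the node.
ranks_per_core = processes_observed / physical_cores
threads_per_process = threads_observed / processes_observed

print(f"physical cores      : {physical_cores}")
print(f"ranks per core      : {ranks_per_core:.2f}")
print(f"threads per process : {threads_per_process:.2f}")
```

One rank per physical core and one thread per rank is consistent with the "52 MPI ranks, OMP_NUM_THREADS=1" note in the comments, so the node appears fully subscribed without oversubscription.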

Configuration Summary

Dataset:
Run Command: <executable> -i mol22_s1.inp
MPI Command: mpirun -n <number_processes>
Number Processes: 52
Number Nodes: 1
Number Processes per Nodes: 52
Filter: Not Used
Profile Start: Not Used
Maximal Path Number: 4
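
The Run Command and MPI Command entries are templates with placeholders. The sketch below shows one plausible expansion into a single launch line. It assumes the MPI command is simply prefixed to the run command, which this report does not confirm, and `expand_launch_line` is a hypothetical helper, not a MAQAO function.

```python
import shlex

# Values copied from the Experiment Summary and Configuration Summary.
executable = "/scratch/eoseret/cp2k-2023.1/exe/Linux-intel-x86_64-minimal/cp2k.psmp"
run_command = "<executable> -i mol22_s1.inp"
mpi_command = "mpirun -n <number_processes>"
number_processes = 52


def expand_launch_line(mpi_template, run_template, exe_path, n_processes):
    """Substitute the placeholders and join the two templates.

    Assumption: the MPI command comes first and the run command follows,
    separated by a single space.
    """
    mpi_part = mpi_template.replace("<number_processes>", str(n_processes))
    # shlex.quote leaves this path unchanged but protects paths with spaces.
    run_part = run_template.replace("<executable>", shlex.quote(exe_path))
    return f"{mpi_part} {run_part}"


launch_line = expand_launch_line(
    mpi_command, run_command, executable, number_processes
)
print(launch_line)
```

The experiment comments mention OMP_NUM_THREADS=1, so reproducing the run would presumably also require that variable to be set in the environment before launching.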