Function: miniqmcreference::TwoBodyJastrowRef<qmcplusplus::BsplineFunctor<double> >::evalGrad(qmcplu ... | Module: exec | Source: VectorSoAContainer.h:231-231 [...] | Coverage: 0.01% |
---|
Function: miniqmcreference::TwoBodyJastrowRef<qmcplusplus::BsplineFunctor<double> >::evalGrad(qmcplu ... | Module: exec | Source: VectorSoAContainer.h:231-231 [...] | Coverage: 0.01% |
---|
/home/hbollore/qaas-runs/171-284-6744/intel/miniqmc/build/miniqmc/src/Numerics/OhmmsPETE/TinyVector.h: 145 - 145 |
-------------------------------------------------------------------------------- |
145: X[i] = base[i * offset]; |
/home/hbollore/qaas-runs/171-284-6744/intel/miniqmc/build/miniqmc/src/Numerics/OhmmsPETE/VectorSoAContainer.h: 231 - 231 |
-------------------------------------------------------------------------------- |
231: inline const AoSElement_t operator[](size_t i) const { return AoSElement_t(myData + i, nGhosts); } |
/home/hbollore/qaas-runs/171-284-6744/intel/miniqmc/build/miniqmc/src/QMCWaveFunctions/Jastrow/TwoBodyJastrowRef.h: 293 - 293 |
-------------------------------------------------------------------------------- |
293: return GradType(dUat[iat]); |
0x419820 LDR X8, [X0, #256] |
0x419824 LDR X9, [X0, #240] |
0x419828 ADD X8, X8, W2,SXTW #3 |
0x41982c SBFM X10, X9, #61, #31 |
0x419830 SBFM X9, X9, #60, #31 |
0x419834 LDR D0, [X8] |
0x419838 LDR D1, [X8, X10] |
0x41983c LDR D2, [X8, X9] |
0x419840 RET |
0x419844 HINT #0 |
0x419848 HINT #0 |
0x41984c HINT #0 |
0x419850 HINT #0 |
0x419854 HINT #0 |
0x419858 HINT #0 |
0x41985c HINT #0 |
Coverage (%) | Name | Source Location | Module |
---|
Path / |
Source file and lines | VectorSoAContainer.h:231-231 |
Module | exec |
nb instructions | 16 |
loop length | 64 |
nb stack references | 0 |
front end | 1.13 cycles |
P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 | P13 | P14 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
uops | 0.50 | 0.50 | 0.75 | 0.75 | 0.75 | 0.75 | 0.00 | 0.00 | 0.00 | 0.00 | 1.67 | 1.67 | 1.67 | 0.00 | 0.00 |
cycles | 0.50 | 0.50 | 0.75 | 0.75 | 0.75 | 0.75 | 0.00 | 0.00 | 0.00 | 0.00 | 1.67 | 1.67 | 1.67 | 0.00 | 0.00 |
Cycles executing div or sqrt instructions | NA |
Front-end | 1.13 |
Overall L1 | 1.67 |
all | 0% |
load | 0% |
store | NA (no store vectorizable/vectorized instructions) |
mul | NA (no mul vectorizable/vectorized instructions) |
add-sub | NA (no add-sub vectorizable/vectorized instructions) |
fma | NA (no fma vectorizable/vectorized instructions) |
div/sqrt | NA (no div/sqrt vectorizable/vectorized instructions) |
other | NA (no other vectorizable/vectorized instructions) |
Instruction | Nb FU | P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 | P13 | P14 | Latency | Recip. throughput |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
LDR X8, [X0, #256] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 4 | 0.33 |
LDR X9, [X0, #240] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 4 | 0.33 |
ADD X8, X8, W2,SXTW #3 | 1 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0.50 |
SBFM X10, X9, #61, #31 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 |
SBFM X9, X9, #60, #31 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 |
LDR D0, [X8] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 |
LDR D1, [X8, X10] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 |
LDR D2, [X8, X9] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 |
RET | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
HINT #0 | ||||||||||||||||||
HINT #0 | ||||||||||||||||||
HINT #0 | ||||||||||||||||||
HINT #0 | ||||||||||||||||||
HINT #0 | ||||||||||||||||||
HINT #0 | ||||||||||||||||||
HINT #0 |
Source file and lines | VectorSoAContainer.h:231-231 |
Module | exec |
nb instructions | 16 |
loop length | 64 |
nb stack references | 0 |
front end | 1.13 cycles |
P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 | P13 | P14 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
uops | 0.50 | 0.50 | 0.75 | 0.75 | 0.75 | 0.75 | 0.00 | 0.00 | 0.00 | 0.00 | 1.67 | 1.67 | 1.67 | 0.00 | 0.00 |
cycles | 0.50 | 0.50 | 0.75 | 0.75 | 0.75 | 0.75 | 0.00 | 0.00 | 0.00 | 0.00 | 1.67 | 1.67 | 1.67 | 0.00 | 0.00 |
Cycles executing div or sqrt instructions | NA |
Front-end | 1.13 |
Overall L1 | 1.67 |
all | 0% |
load | 0% |
store | NA (no store vectorizable/vectorized instructions) |
mul | NA (no mul vectorizable/vectorized instructions) |
add-sub | NA (no add-sub vectorizable/vectorized instructions) |
fma | NA (no fma vectorizable/vectorized instructions) |
div/sqrt | NA (no div/sqrt vectorizable/vectorized instructions) |
other | NA (no other vectorizable/vectorized instructions) |
Instruction | Nb FU | P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 | P13 | P14 | Latency | Recip. throughput |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
LDR X8, [X0, #256] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 4 | 0.33 |
LDR X9, [X0, #240] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 4 | 0.33 |
ADD X8, X8, W2,SXTW #3 | 1 | 0 | 0 | 0 | 0 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0.50 |
SBFM X10, X9, #61, #31 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 |
SBFM X9, X9, #60, #31 | 1 | 0 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.25 |
LDR D0, [X8] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 |
LDR D1, [X8, X10] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 |
LDR D2, [X8, X9] | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 0.33 | 0.33 | 0 | 0 | 6 | 0.33 |
RET | 1 | 0.50 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.50 |
HINT #0 | ||||||||||||||||||
HINT #0 | ||||||||||||||||||
HINT #0 | ||||||||||||||||||
HINT #0 | ||||||||||||||||||
HINT #0 | ||||||||||||||||||
HINT #0 | ||||||||||||||||||
HINT #0 |
Name | Coverage (%) | Time (s) |
---|---|---|
○miniqmcreference::TwoBodyJastrowRef | 0.01 | 0.01 |