Function: hypre_BinarySearch | Module: exec | Source: binsearch.c:29-53 [...] | Coverage: 0.01% |
---|
/scratch_na/users/xoserete/qaas_runs/171-415-3872/intel/AMG/build/AMG/AMG/utilities/binsearch.c: 29 - 53 |
-------------------------------------------------------------------------------- |
29: { |
30: HYPRE_Int low, high, m; |
31: HYPRE_Int not_found = 1; |
32: |
33: low = 0; |
34: high = list_length-1; |
35: while (not_found && low <= high) |
36: { |
37: m = (low + high) / 2; |
38: if (value < list[m]) |
39: { |
40: high = m - 1; |
41: } |
42: else if (value > list[m]) |
43: { |
44: low = m + 1; |
[...] |
53: } |
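The listing above elides source lines 45-52. As a reading aid only, here is a minimal self-contained sketch of the same search pattern. It is not the hypre source. The name `binary_search_sketch` and the `sketch_int` typedef are hypothetical. The 64-bit signed integer type is an assumption based on the 8-byte scaled index in the disassembly. The return convention (matching index, or -1 when the value is absent) is inferred from the `MOV $-0x1,%RAX` prologue and the `MOV %R8,%RAX` exit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t sketch_int; /* assumption: stands in for HYPRE_Int */

/* Returns the index of value in the ascending-sorted array list,
   or -1 if value is not present (assumed convention, see lead-in). */
static sketch_int binary_search_sketch(const sketch_int *list,
                                       sketch_int value,
                                       sketch_int list_length)
{
    sketch_int low = 0;
    sketch_int high = list_length - 1;

    /* An empty range needs no array; otherwise list must be valid. */
    assert(list != NULL || list_length <= 0);

    while (low <= high)
    {
        sketch_int m = (low + high) / 2; /* midpoint, as on source line 37 */
        if (value < list[m])
        {
            high = m - 1;                /* continue in the lower half */
        }
        else if (value > list[m])
        {
            low = m + 1;                 /* continue in the upper half */
        }
        else
        {
            return m;                    /* exact match */
        }
    }
    return -1;                           /* range exhausted without a match */
}
```

The sketch returns directly on a match instead of clearing a `not_found` flag. The control flow is equivalent, and it matches the compiled code, which jumps straight to the exit at 0x4e51c3 on equality.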
0x4e5140 PUSH %RBP |
0x4e5141 MOV %RSP,%RBP |
0x4e5144 MOV $-0x1,%RAX |
0x4e514b TEST %RDX,%RDX |
0x4e514e JLE 4e51c1 |
0x4e5150 DEC %RDX |
0x4e5153 XOR %ECX,%ECX |
0x4e5155 JMP 4e516b |
0x4e5157 NOPW (%RAX,%RAX,1) |
(4422) 0x4e5160 DEC %R8 |
(4422) 0x4e5163 MOV %R8,%RDX |
(4422) 0x4e5166 CMP %RDX,%RCX |
(4422) 0x4e5169 JG 4e51c1 |
(4422) 0x4e516b LEA (%RDX,%RCX,1),%R9 |
(4422) 0x4e516f MOV %R9,%R8 |
(4422) 0x4e5172 SHR $0x3f,%R8 |
(4422) 0x4e5176 ADD %R9,%R8 |
(4422) 0x4e5179 SAR $0x1,%R8 |
(4422) 0x4e517c CMP %RSI,(%RDI,%R8,8) |
(4422) 0x4e5180 JLE 4e5190 |
(4422) 0x4e5182 DEC %R8 |
(4422) 0x4e5185 MOV %R8,%RDX |
(4422) 0x4e5188 CMP %RDX,%RCX |
(4422) 0x4e518b JLE 4e519d |
0x4e518d JMP 4e51c1 |
0x4e518f NOP |
(4422) 0x4e5190 JGE 4e51c3 |
(4422) 0x4e5192 INC %R8 |
(4422) 0x4e5195 MOV %R8,%RCX |
(4422) 0x4e5198 CMP %RDX,%RCX |
(4422) 0x4e519b JG 4e51c1 |
(4422) 0x4e519d LEA (%RDX,%RCX,1),%R9 |
(4422) 0x4e51a1 MOV %R9,%R8 |
(4422) 0x4e51a4 SHR $0x3f,%R8 |
(4422) 0x4e51a8 ADD %R9,%R8 |
(4422) 0x4e51ab SAR $0x1,%R8 |
(4422) 0x4e51ae CMP %RSI,(%RDI,%R8,8) |
(4422) 0x4e51b2 JG 4e5160 |
(4422) 0x4e51b4 JGE 4e51c3 |
(4422) 0x4e51b6 INC %R8 |
(4422) 0x4e51b9 MOV %R8,%RCX |
(4422) 0x4e51bc CMP %RDX,%RCX |
(4422) 0x4e51bf JLE 4e516b |
0x4e51c1 POP %RBP |
0x4e51c2 RET |
0x4e51c3 MOV %R8,%RAX |
0x4e51c6 POP %RBP |
0x4e51c7 RET |
0x4e51c8 NOPL (%RAX,%RAX,1) |
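In loop 4422, the `SHR $0x3f` / `ADD` / `SAR $0x1` sequence that follows each `LEA` is the usual compiler idiom for signed division by 2 with rounding toward zero, i.e. `m = (low + high) / 2` from source line 37. The sketch below reproduces that sequence in C. The function name `div2_like_asm` is hypothetical, and the code assumes that `>>` on a negative `int64_t` is an arithmetic shift, which is implementation-defined in C but holds for mainstream x86-64 compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors SHR $0x3f, ADD, SAR $0x1: add the sign bit to the dividend,
   then shift right by one. Assumes arithmetic right shift for negative
   values (implementation-defined in C). */
static int64_t div2_like_asm(int64_t x)
{
    int64_t sign = (int64_t)((uint64_t)x >> 63); /* 1 if x < 0, else 0 */
    return (x + sign) >> 1;                      /* equals x / 2 in C */
}
```

Because `low` and `high` are never negative inside the search, the sign correction is always zero here. The compiler still emits it because the operand type is signed.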
Coverage (%) | Name | Source Location | Module |
---|---|---|---|
►22.73+ | hypre_IJMatrixAssembleParCSR.e[...] | IJMatrix_parcsr.c:2846 | exec |
○ | __kmp_invoke_microtask | libiomp5.so | |
○ | __kmp_invoke_task_func | libiomp5.so | |
►18.18+ | hypre_ParTMatmul | par_csr_matop.c:3401 | exec |
○ | hypre_BoomerAMGSetup | par_amg_setup.c:1227 | exec |
○ | hypre_PCGSetup | pcg.c:234 | exec |
○ | main | amg.c:398 | exec |
○ | __libc_start_main | libc-2.28.so | |
►9.09+ | hypre_IJMatrixAssembleParCSR.e[...] | IJMatrix_parcsr.c:2846 | exec |
○ | __kmp_invoke_microtask | libiomp5.so | |
○ | __kmp_invoke_task_func | libiomp5.so | |
►9.09+ | hypre_IJMatrixAssembleParCSR.e[...] | IJMatrix_parcsr.c:2846 | exec |
○ | __kmp_invoke_microtask | libiomp5.so | |
○ | __kmp_invoke_task_func | libiomp5.so | |
►9.09+ | hypre_IJMatrixAssembleParCSR.e[...] | IJMatrix_parcsr.c:2846 | exec |
○ | __kmp_invoke_microtask | libiomp5.so | |
○ | __kmp_invoke_task_func | libiomp5.so | |
►9.09+ | hypre_IJMatrixAssembleParCSR.e[...] | IJMatrix_parcsr.c:2846 | exec |
○ | __kmp_invoke_microtask | libiomp5.so | |
○ | __kmp_invoke_task_func | libiomp5.so | |
►4.55+ | hypre_IJMatrixAssembleParCSR.e[...] | IJMatrix_parcsr.c:2846 | exec |
○ | __kmp_invoke_microtask | libiomp5.so | |
○ | __kmp_invoke_task_func | libiomp5.so | |
►4.55+ | __kmp_invoke_microtask | libiomp5.so | |
○ | __kmp_invoke_task_func | libiomp5.so | |
►4.54+ | hypre_BoomerAMGBuildMultipass | par_multi_interp.c:843 | exec |
○ | hypre_BoomerAMGSetup | par_amg_setup.c:737 | exec |
○ | hypre_PCGSetup | pcg.c:234 | exec |
○ | main | amg.c:398 | exec |
○ | __libc_start_main | libc-2.28.so | |
►4.54+ | hypre_IJMatrixAssembleParCSR.e[...] | IJMatrix_parcsr.c:2846 | exec |
○ | __kmp_invoke_microtask | libiomp5.so | |
○ | __kmp_invoke_task_func | libiomp5.so | |
►4.54+ | hypre_IJMatrixAssembleParCSR.e[...] | IJMatrix_parcsr.c:2846 | exec |
○ | __kmp_invoke_microtask | libiomp5.so | |
○ | __kmp_invoke_task_func | libiomp5.so |
Path / |
Source file and lines | binsearch.c:29-53 |
Module | exec |
nb instructions | 17 |
nb uops | 17 |
loop length | 50 |
used x86 registers | 6 |
used mmx registers | 0 |
used xmm registers | 0 |
used ymm registers | 0 |
used zmm registers | 0 |
nb stack references | 0 |
micro-operation queue | 2.83 cycles |
front end | 2.83 cycles |
Metric | P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
uops | 1.50 | 0.80 | 1.33 | 1.33 | 0.50 | 0.60 | 1.50 | 0.50 | 0.50 | 0.50 | 0.60 | 1.33 |
cycles | 1.50 | 0.80 | 1.33 | 1.33 | 0.50 | 0.60 | 1.50 | 0.50 | 0.50 | 0.50 | 0.60 | 1.33 |
Cycles executing div or sqrt instructions | NA |
FE+BE cycles | 2.93-2.95 |
Stall cycles | 0.00 |
Front-end | 2.83 |
Dispatch | 1.50 |
Overall L1 | 2.83 |
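The cycle figures above are consistent with a simple bound model: front-end cycles as the uop count divided by the issue width, dispatch as the busiest execution port, and the overall L1 figure as the larger of the two. The sketch below works through that arithmetic. The issue width of 6 uops per cycle is an assumption inferred from 17 / 6 ≈ 2.83 and is not stated in the report. The function names are hypothetical, and the profiler's actual model may include terms not shown here.

```c
#include <stddef.h>

/* Front-end bound: uops divided by the assumed issue width. */
static double front_end_cycles(int nb_uops, int issue_width)
{
    return (double)nb_uops / (double)issue_width;
}

/* Dispatch bound: cycles on the most heavily loaded port. */
static double busiest_port_cycles(const double *port_cycles, size_t n)
{
    double max = 0.0;
    for (size_t i = 0; i < n; i++)
    {
        if (port_cycles[i] > max)
        {
            max = port_cycles[i];
        }
    }
    return max;
}

/* Overall bound: the slower of the front end and dispatch. */
static double overall_cycles(double front_end, double dispatch)
{
    return front_end > dispatch ? front_end : dispatch;
}
```

With the per-port cycles from the table (P0 through P11), the busiest ports are P0 and P6 at 1.50 cycles, so the front end at about 2.83 cycles is the limiting resource in this model.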
all | 0% |
load | NA (no load vectorizable/vectorized instructions) |
store | NA (no store vectorizable/vectorized instructions) |
mul | NA (no mul vectorizable/vectorized instructions) |
add-sub | NA (no add-sub vectorizable/vectorized instructions) |
fma | NA (no fma vectorizable/vectorized instructions) |
div/sqrt | NA (no div/sqrt vectorizable/vectorized instructions) |
other | 0% |
all | 10% |
load | NA (no load vectorizable/vectorized instructions) |
store | NA (no store vectorizable/vectorized instructions) |
mul | NA (no mul vectorizable/vectorized instructions) |
add-sub | NA (no add-sub vectorizable/vectorized instructions) |
fma | NA (no fma vectorizable/vectorized instructions) |
div/sqrt | NA (no div/sqrt vectorizable/vectorized instructions) |
other | 10% |
Instruction | Nb FU | P0 | P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | Latency | Recip. throughput |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
PUSH %RBP | 1 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0.50 | 0.50 | 0.50 | 0 | 0 | 5-12 | 0.50 |
MOV %RSP,%RBP | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.17 |
MOV $-0x1,%RAX | 1 | 0.20 | 0.20 | 0 | 0 | 0 | 0.20 | 0.20 | 0 | 0 | 0 | 0.20 | 0 | 1 | 0.20 |
TEST %RDX,%RDX | 1 | 0.20 | 0.20 | 0 | 0 | 0 | 0.20 | 0.20 | 0 | 0 | 0 | 0.20 | 0 | 2 | 0.20 |
JLE 4e51c1 <hypre_BinarySearch+0x81> | 1 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0 | 0 | 0.50 |
DEC %RDX | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.17 |
XOR %ECX,%ECX | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.17 |
JMP 4e516b <hypre_BinarySearch+0x2b> | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5.84 |
NOPW (%RAX,%RAX,1) | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.17 |
JMP 4e51c1 <hypre_BinarySearch+0x81> | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5.84 |
NOP | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.17 |
POP %RBP | 1 | 0 | 0 | 0.33 | 0.33 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 1-6 | 0.33 |
RET | 1 | 0.50 | 0 | 0.33 | 0.33 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0.33 | 0 | 2.13 |
MOV %R8,%RAX | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0.17 |
POP %RBP | 1 | 0 | 0 | 0.33 | 0.33 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.33 | 1-6 | 0.33 |
RET | 1 | 0.50 | 0 | 0.33 | 0.33 | 0 | 0 | 0.50 | 0 | 0 | 0 | 0 | 0.33 | 0 | 2.13 |
NOPL (%RAX,%RAX,1) | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.17 |
Name | Coverage (%) | Time (s) |
---|---|---|
hypre_BinarySearch | 0.01 | 0 |
○Loop 4422 - binsearch.c:35-44 - exec | 0 | 0.01 |