OV - Compare Loops

MAQAO

Loops

▶main.cpp: 117 - 130.36 %

Run	Loop Source Regions	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s
Neoverse V1 ACFL Ofast (base)	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 117-123	7	1.74	1.47	70.23	41.67	37.5	17.02
Neoverse V1 ACFL Ofast SoA
Neoverse V1 ACFL Ofast Manual Unroll	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 118-131	8	1.16	1.02	60.13	35.71	38.39	10.73

Run Neoverse V1 ACFL Ofast (base): Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 7)
Analysis	Count
Loop Computation Issues
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA	1
Control Flow Issues
Presence of 2 to 4 paths	1
Vectorization Roadblocks
Presence of 2 to 4 paths	1

Run Neoverse V1 ACFL Ofast SoA: No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.

Run Neoverse V1 ACFL Ofast Manual Unroll: Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 8)
Analysis	Count
Loop Computation Issues
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA	1
Control Flow Issues
Presence of 2 to 4 paths
Vectorization Roadblocks
Presence of 2 to 4 paths
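
Note on the two findings above ("Presence of 2 to 4 paths" and low FMA use). The source of main.cpp is not part of this report, so the following is only a minimal sketch of what such a loop might look like in a k-means assignment step. The function names `nearest_branchy` and `nearest_select`, the row-major centroid layout, and the parameters `k` and `dim` are all assumptions made for illustration. The first variant updates the running minimum inside an `if`, which a compiler may keep as a separate control-flow path; the second expresses the same update as selects and uses `std::fma` for the accumulation. Whether either form changes the generated code under ACFL with -Ofast would have to be checked in a new MAQAO run.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical nearest-centroid search for a single point.
// Assumed layout: centroids holds k rows of dim consecutive floats.
std::size_t nearest_branchy(const float* p, const std::vector<float>& centroids,
                            std::size_t k, std::size_t dim) {
    assert(centroids.size() == k * dim);
    float best = std::numeric_limits<float>::max();
    std::size_t best_idx = 0;
    for (std::size_t c = 0; c < k; ++c) {
        float d = 0.0f;
        for (std::size_t j = 0; j < dim; ++j) {
            const float diff = p[j] - centroids[c * dim + j];
            d += diff * diff;  // multiply and add written as separate operations
        }
        if (d < best) {        // data-dependent branch: a possible extra path
            best = d;
            best_idx = c;
        }
    }
    return best_idx;
}

// Same search, with the minimum update written as selects and the
// accumulation written as an explicit fused multiply-add.
std::size_t nearest_select(const float* p, const std::vector<float>& centroids,
                           std::size_t k, std::size_t dim) {
    assert(centroids.size() == k * dim);
    float best = std::numeric_limits<float>::max();
    std::size_t best_idx = 0;
    for (std::size_t c = 0; c < k; ++c) {
        float d = 0.0f;
        for (std::size_t j = 0; j < dim; ++j) {
            const float diff = p[j] - centroids[c * dim + j];
            d = std::fma(diff, diff, d);  // d = diff * diff + d in one rounding step
        }
        const bool closer = d < best;
        best = closer ? d : best;
        best_idx = closer ? c : best_idx;
    }
    return best_idx;
}
```

Both variants return the first centroid with the smallest squared distance, so they can be swapped without changing the clustering result apart from possible rounding differences introduced by the fused operation.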

▶main.cpp: 71 - 66.10 %

Run	Loop Source Regions	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s
Neoverse V1 ACFL Ofast (base)
Neoverse V1 ACFL Ofast SoA	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 71-76	8	1.51	1.28	66.10	12.5	26.56	8.73
Neoverse V1 ACFL Ofast Manual Unroll

Run Neoverse V1 ACFL Ofast (base): No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.

Run Neoverse V1 ACFL Ofast SoA: Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 8)
Analysis	Count

Run Neoverse V1 ACFL Ofast Manual Unroll: No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
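
Note on the "SoA" run name. The report does not show how the benchmark lays out its data, so the sketch below only illustrates the general array-of-structures versus structure-of-arrays distinction that the name suggests. The types `PointAoS` and `PointsSoA`, the helper `to_soa`, and the choice of two coordinates per point are hypothetical. In the SoA form each coordinate is a contiguous array, which tends to give the compiler unit-stride loads; it says nothing about the measured numbers in this report.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array of structures: the coordinates of one point are adjacent in memory.
struct PointAoS { float x, y; };

// Structure of arrays: all x values are contiguous, then all y values.
struct PointsSoA { std::vector<float> x, y; };

// Convert an AoS point set to the SoA layout (illustrative helper).
PointsSoA to_soa(const std::vector<PointAoS>& pts) {
    PointsSoA out;
    out.x.reserve(pts.size());
    out.y.reserve(pts.size());
    for (const PointAoS& p : pts) {
        out.x.push_back(p.x);
        out.y.push_back(p.y);
    }
    return out;
}

// Squared distance of every point to (cx, cy), AoS layout:
// each iteration reads x and y with a stride of sizeof(PointAoS).
std::vector<float> sq_dist_aos(const std::vector<PointAoS>& pts, float cx, float cy) {
    std::vector<float> d(pts.size());
    for (std::size_t i = 0; i < pts.size(); ++i) {
        const float dx = pts[i].x - cx;
        const float dy = pts[i].y - cy;
        d[i] = dx * dx + dy * dy;
    }
    return d;
}

// Same computation, SoA layout: x[i] and y[i] are unit-stride streams.
std::vector<float> sq_dist_soa(const PointsSoA& pts, float cx, float cy) {
    assert(pts.x.size() == pts.y.size());
    std::vector<float> d(pts.x.size());
    for (std::size_t i = 0; i < pts.x.size(); ++i) {
        const float dx = pts.x[i] - cx;
        const float dy = pts.y[i] - cy;
        d[i] = dx * dx + dy * dy;
    }
    return d;
}
```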

▶main.cpp: 156 - 8.57 %

Run	Loop Source Regions	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s
Neoverse V1 ACFL Ofast (base)
Neoverse V1 ACFL Ofast SoA
Neoverse V1 ACFL Ofast Manual Unroll	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 156-160	39	0.20	0.15	8.57	0	20.83	1.51

Run Neoverse V1 ACFL Ofast (base): No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.

Run Neoverse V1 ACFL Ofast SoA: No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.

Run Neoverse V1 ACFL Ofast Manual Unroll: Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 39)
Analysis	Count
Loop Computation Issues
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA	1
Presence of a large number of scalar integer instructions	1
Data Access Issues
Presence of indirect access	1
Vectorization Roadblocks
Presence of indirect access	1

▶main.cpp: 140 - 7.87 %

Run	Loop Source Regions	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s
Neoverse V1 ACFL Ofast (base)	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 140-144	25	0.19	0.17	7.87	0	20.83	1.2
Neoverse V1 ACFL Ofast SoA
Neoverse V1 ACFL Ofast Manual Unroll

Run Neoverse V1 ACFL Ofast (base): Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 25)
Analysis	Count
Loop Computation Issues
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA	1
Presence of a large number of scalar integer instructions	1
Data Access Issues
Presence of indirect access	1
Vectorization Roadblocks
Presence of indirect access	1

Run Neoverse V1 ACFL Ofast SoA: No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.

Run Neoverse V1 ACFL Ofast Manual Unroll: No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
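
Note on "Presence of indirect access". The flagged source lines are not included in this report, so this is a sketch under assumptions rather than the benchmark's code. A typical k-means update step adds each point to the sum of its assigned cluster, which indexes the output through the assignment array (`sums[assign[i]]`). The function names, the one-dimensional data, and the `int` assignment type are hypothetical. The second function shows one possible rewrite that replaces the scatter with one pass per cluster over contiguous data; it does `k` times more reads, so it is only plausible for small `k`, and its effect here is unmeasured.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scatter-style accumulation: the store address depends on assign[i],
// which is the indirect access pattern the analysis refers to.
void accumulate_indirect(const std::vector<float>& x, const std::vector<int>& assign,
                         std::vector<float>& sums, std::vector<int>& counts) {
    assert(x.size() == assign.size());
    assert(sums.size() == counts.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        const std::size_t c = static_cast<std::size_t>(assign[i]);
        assert(c < sums.size());
        sums[c] += x[i];
        counts[c] += 1;
    }
}

// Alternative: one pass per cluster with a mask. Every load is unit-stride
// and the accumulators are scalars, at the cost of k passes over the data.
void accumulate_per_cluster(const std::vector<float>& x, const std::vector<int>& assign,
                            std::vector<float>& sums, std::vector<int>& counts) {
    assert(x.size() == assign.size());
    assert(sums.size() == counts.size());
    for (std::size_t c = 0; c < sums.size(); ++c) {
        float s = 0.0f;
        int n = 0;
        for (std::size_t i = 0; i < x.size(); ++i) {
            const bool member = static_cast<std::size_t>(assign[i]) == c;
            s += member ? x[i] : 0.0f;  // select instead of an indexed store
            n += member ? 1 : 0;
        }
        sums[c] += s;
        counts[c] += n;
    }
}
```

The two functions sum the points of a cluster in different orders, so floating-point results can differ in the last bits for general inputs.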

▶main.cpp: 93 - 6.34 %

Run	Loop Source Regions	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s
Neoverse V1 ACFL Ofast (base)
Neoverse V1 ACFL Ofast SoA	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 93-97	39	0.19	0.12	6.34	0	20.83	1.47
Neoverse V1 ACFL Ofast Manual Unroll

Run Neoverse V1 ACFL Ofast (base): No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.

Run Neoverse V1 ACFL Ofast SoA: Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 39)
Analysis	Count
Loop Computation Issues
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA	1
Presence of a large number of scalar integer instructions	1
Data Access Issues
Presence of indirect access	1
Vectorization Roadblocks
Presence of indirect access	1

Run Neoverse V1 ACFL Ofast Manual Unroll: No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
