OV - Compare Loops

MAQAO

Loops

▶main.cpp: 71 - 66.84 %

Run	Loop Source Regions	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	Optimizer Analysis
Run Neoverse V1 ACFL Ofast Base (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast SoA (250 iterations, 64 threads)	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 71-76	8	14.99	12.69	66.84	12.5	26.56	1.87	Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 8)
Run Neoverse V1 ACFL Ofast Manual Unroll (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast Manual Unroll + SoA (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.

Analysis	Count
(none listed)

▶main.cpp: 74 - 66.47 %

Run	Loop Source Regions	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	Optimizer Analysis
Run Neoverse V1 ACFL Ofast Base (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast SoA (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast Manual Unroll (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast Manual Unroll + SoA (250 iterations, 64 threads)	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 74-87	8	12.64	11.22	66.47	11.76	25.74	2.09	Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 8)

Analysis	Count
(none listed)

▶main.cpp: 118 - 62.81 %

Run	Loop Source Regions	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	Optimizer Analysis
Run Neoverse V1 ACFL Ofast Base (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast SoA (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast Manual Unroll (250 iterations, 64 threads)	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 118-131	8	11.81	10.19	62.81	35.71	38.39	2.28	Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 8)
Run Neoverse V1 ACFL Ofast Manual Unroll + SoA (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.

Analysis (Run Neoverse V1 ACFL Ofast Manual Unroll)	Count
Loop Computation Issues
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA	1
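
The FMA finding above says that few of this loop's floating-point additions and multiplications were fused into multiply-add instructions. The report does not show the source of main.cpp lines 118-131, so the following is only a sketch of what an FMA-friendly accumulation can look like. The function name `squared_distance`, its parameters, and the use of `double` are assumptions made for illustration and are not taken from the benchmark.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper, not taken from the benchmark's main.cpp:
// squared Euclidean distance between a point and a centroid.
// Each iteration is one subtraction followed by one fused multiply-add.
double squared_distance(const double* point, const double* centroid, std::size_t dims) {
    double acc = 0.0;
    for (std::size_t d = 0; d < dims; ++d) {
        const double diff = point[d] - centroid[d];
        // std::fma computes diff * diff + acc with a single rounding step.
        acc = std::fma(diff, diff, acc);
    }
    return acc;
}
```

Whether the compiler already contracts `diff * diff + acc` on its own depends on its floating-point settings; the explicit `std::fma` call only makes the intent visible in the source.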

▶main.cpp: 156 - 7.62 %

Run	Loop Source Regions	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	Optimizer Analysis
Run Neoverse V1 ACFL Ofast Base (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast SoA (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast Manual Unroll (250 iterations, 64 threads)	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 156-160	39	1.71	1.24	7.62	0	20.83	2.03	Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 39)
Run Neoverse V1 ACFL Ofast Manual Unroll + SoA (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.

Analysis (Run Neoverse V1 ACFL Ofast Manual Unroll)	Count
Loop Computation Issues
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA	1
Presence of a large number of scalar integer instructions	1
Data Access Issues
Presence of indirect access	1
Vectorization Roadblocks
Presence of indirect access	1
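
"Indirect access" in the findings above means that an array index is itself loaded from memory, which typically blocks straightforward vectorization. The report does not include the source of main.cpp lines 156-160, so the sketch below only illustrates the general pattern with a k-means style accumulation. The function `accumulate` and all of its parameters are hypothetical and not taken from the benchmark.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical centroid-accumulation step, not taken from main.cpp.
// sums[assignment[i]] is an indirect access: the index into `sums`
// is read from `assignment`, so consecutive iterations may touch
// arbitrary (and possibly repeated) elements of `sums`.
void accumulate(const std::vector<double>& values,
                const std::vector<int>& assignment,
                std::vector<double>& sums,
                std::vector<int>& counts) {
    for (std::size_t i = 0; i < values.size(); ++i) {
        const int c = assignment[i];  // index loaded from memory
        sums[c] += values[i];         // indirect read-modify-write
        counts[c] += 1;               // scalar integer bookkeeping
    }
}
```

Because two iterations can update the same `sums[c]`, a compiler cannot safely process several iterations at once without extra conflict handling, which is consistent with a vectorization ratio of 0 for a loop of this shape.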

▶main.cpp: 112 - 6.59 %

Run	Loop Source Regions	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	Optimizer Analysis
Run Neoverse V1 ACFL Ofast Base (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast SoA (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast Manual Unroll (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast Manual Unroll + SoA (250 iterations, 64 threads)	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 112-116	38	1.50	1.11	6.59	0	20.83	3.27	Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 38)

Analysis (Run Neoverse V1 ACFL Ofast Manual Unroll + SoA)	Count
Loop Computation Issues
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA	1
Presence of a large number of scalar integer instructions	1
Data Access Issues
Presence of indirect access	1
Vectorization Roadblocks
Presence of indirect access	1

▶main.cpp: 93 - 6.39 %

Run	Loop Source Regions	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	Optimizer Analysis
Run Neoverse V1 ACFL Ofast Base (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast SoA (250 iterations, 64 threads)	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 93-97	39	1.77	1.21	6.39	0	20.83	1.31	Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 39)
Run Neoverse V1 ACFL Ofast Manual Unroll (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast Manual Unroll + SoA (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.

Analysis (Run Neoverse V1 ACFL Ofast SoA)	Count
Loop Computation Issues
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA	1
Presence of a large number of scalar integer instructions	1
Data Access Issues
Presence of indirect access	1
Vectorization Roadblocks
Presence of indirect access	1

▶main.cpp: 140 - 6.09 %

Run	Loop Source Regions	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	Optimizer Analysis
Run Neoverse V1 ACFL Ofast Base (250 iterations, 64 threads)	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 140-144	38	1.75	1.18	6.09	0	20.83	1.53	Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 38)
Run Neoverse V1 ACFL Ofast SoA (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast Manual Unroll (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast Manual Unroll + SoA (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.

Analysis (Run Neoverse V1 ACFL Ofast Base)	Count
Loop Computation Issues
Less than 10% of the FP ADD/SUB/MUL arithmetic operations are performed using FMA	1
Presence of a large number of scalar integer instructions	1
Data Access Issues
Presence of indirect access	1
Vectorization Roadblocks
Presence of indirect access	1

▶main.cpp: 115 - 0.27 %

Run	Loop Source Regions	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	Optimizer Analysis
Run Neoverse V1 ACFL Ofast Base (250 iterations, 64 threads)	/home/fmusial/KMEANS_Benchmarks/kmeans/main.cpp: 115-117	7	0.13	0.05	0.27	0	18.75	1.65	Sum on 1 analyzed binary loop (kmeans-acfl-Ofast - 7)
Run Neoverse V1 ACFL Ofast SoA (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast Manual Unroll (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast Manual Unroll + SoA (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.

Analysis (Run Neoverse V1 ACFL Ofast Base)	Count
Control Flow Issues
Vectorization Roadblocks
Presence of more than 4 paths	1

▶<unknown>: 0 - 0.00 %

Run	Loop Source Regions	ASM Loop ID	Max Time Over Threads (s)	Time w.r.t. Wall Time (s)	Cov (%)	Vect. Ratio (%)	Vector Length Use (%)	GFLOP/s	Optimizer Analysis
Run Neoverse V1 ACFL Ofast Base (250 iterations, 64 threads)	-	-	-	-	-	-	-	-	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast SoA (250 iterations, 64 threads)	-	16	0.00	0.00	0.00	0	0	0	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast Manual Unroll (250 iterations, 64 threads)	-	16	0.00	0.00	0.00	0	0	0	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.
Run Neoverse V1 ACFL Ofast Manual Unroll + SoA (250 iterations, 64 threads)	-	9	0.00	0.00	0.00	0	0	0	No Optimizer analysis found for any assembly loop. More loops can be analyzed using option --optimizer-loop-count.

Analysis	Count
(none listed)
