Albert's Notebook: Repost: ARM compiler shoot-out for ffmpeg

Tuesday, March 12, 2013

Repost: ARM compiler shoot-out for ffmpeg

FFmpeg compiler flags, and a performance comparison of the builds produced by various compilers
Reposted from http://hardwarebug.org/2009/08/05/arm-compiler-shoot-out/

A proper comparison of different compilers targeting ARM is long overdue, so I decided to do my part. I compiled FFmpeg using a selection of compilers, and measured the speed of the result when decoding various media samples. Since we are testing compilers, I disabled all hand-written assembler. The tests were run on a Beagle board clocked at 600 MHz.
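The article does not say how the decode times were taken. One plausible approach, sketched here as an assumption rather than the author's method, is to run each build's decoder several times and keep the fastest wall-clock time. The helper name `time_command` and the sample command in the comment are illustrative only.

```python
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run argv `runs` times; return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the program's output so terminal I/O does not skew the timing.
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Hypothetical real use: decode a sample and throw the result away, e.g.
    #   time_command(["./ffmpeg", "-i", "sample.mp4", "-f", "null", "-"])
    # A trivial command stands in here so the sketch runs anywhere.
    elapsed = time_command([sys.executable, "-c", "pass"])
    print(f"fastest run: {elapsed:.3f} s")
```

Taking the minimum rather than the mean is a design choice that filters out runs disturbed by other activity on the board.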

These are the compilers I deemed worthy to participate in the test and the optimisation flags I used with each:

GCC 4.3.3, -mfpu=neon -mfloat-abi=softfp -mcpu=cortex-a8 -std=c99 -fomit-frame-pointer -O3 -fno-math-errno -fno-signed-zeros -fno-tree-vectorize
GCC 4.4.1, -mfpu=neon -mfloat-abi=softfp -mcpu=cortex-a8 -std=c99 -fomit-frame-pointer -O3 -fno-math-errno -fno-signed-zeros -fno-tree-vectorize
CodeSourcery GCC 2007q3 (based on 4.2.1), -mfpu=neon -mfloat-abi=softfp -mcpu=cortex-a8 -std=c99 -fomit-frame-pointer -O3 -fno-math-errno -fno-tree-vectorize
CodeSourcery GCC 2009q1 (based on 4.3.3), -mfpu=neon -mfloat-abi=softfp -mcpu=cortex-a8 -std=c99 -fomit-frame-pointer -O3 -fno-math-errno -fno-signed-zeros -fno-tree-vectorize
ARM RVCT 4.0 Build 591, -mfpu=neon -mfloat-abi=softfp -mcpu=cortex-a8 -std=c99 -fomit-frame-pointer -O3 -fno-math-errno -fno-signed-zeros
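The flag sets above are nearly identical, so the differences are easy to miss. A small sketch that transcribes the list and reports which flags each compiler lacks relative to the union of all flags (the helper name `missing_flags` is made up for this example):

```python
# Flags shared by every compiler in the list above.
COMMON = ("-mfpu=neon -mfloat-abi=softfp -mcpu=cortex-a8 -std=c99 "
          "-fomit-frame-pointer -O3 -fno-math-errno")

# Full flag strings, transcribed from the list above.
FLAGS = {
    "GCC 4.3.3": COMMON + " -fno-signed-zeros -fno-tree-vectorize",
    "GCC 4.4.1": COMMON + " -fno-signed-zeros -fno-tree-vectorize",
    "CodeSourcery 2007q3": COMMON + " -fno-tree-vectorize",
    "CodeSourcery 2009q1": COMMON + " -fno-signed-zeros -fno-tree-vectorize",
    "RVCT 4.0": COMMON + " -fno-signed-zeros",
}


def missing_flags(flags):
    """Map each compiler to the sorted flags it lacks versus the union of all."""
    sets = {name: set(line.split()) for name, line in flags.items()}
    union = set().union(*sets.values())
    return {name: sorted(union - s) for name, s in sets.items()}


if __name__ == "__main__":
    for name, missing in missing_flags(FLAGS).items():
        print(f"{name}: missing {missing if missing else 'nothing'}")
```

Only two builds deviate: CodeSourcery 2007q3 was built without -fno-signed-zeros, and RVCT without -fno-tree-vectorize.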

I would have also included the ARM compiler from Texas Instruments, had it been able to compile FFmpeg.

With sample files chosen to exercise various types of code, the result of the test is, sadly, no surprise. The following table lists the runtimes of the different builds relative to the CodeSourcery 2007q3 build. Lower numbers are better.

Sample name	Codec	Code type	2009q1	4.3.3	4.4.1	RVCT
cathedral	H.264 CABAC	integer	0.97	1.02	1.09	0.93
NeroAVC	H.264 CABAC	integer	0.98	1.02	1.12	0.95
indiana_jones_4	H.264 CAVLC	integer	0.97	1.02	1.09	0.89
NeroRecodeSample	MPEG-4 ASP	integer	0.96	1.03	1.27	0.96
Silent_Light	MP3	64-bit integer	0.89	0.88	0.97	0.44
When_I_Grow_Up	FLAC	integer	0.98	0.98	0.93	0.86
Lumme-Badloop	Vorbis	float	1.03	1.03	1.02	0.97
Canyon	AC-3	float	1.02	1.02	0.99	0.90
lotr	DTS	float	1.02	1.02	1.00	1.03
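Each figure in the table is a build's runtime divided by the CodeSourcery 2007q3 runtime for the same sample. A minimal sketch of that normalisation follows; the timings in `example` are invented for illustration and are not measurements from the article.

```python
def relative_runtimes(seconds, baseline):
    """Divide each build's runtime by the baseline build's runtime."""
    base = seconds[baseline]
    return {name: round(t / base, 2)
            for name, t in seconds.items() if name != baseline}


# Hypothetical wall-clock times in seconds for one sample (not real data).
example = {"2007q3": 20.0, "2009q1": 19.4, "RVCT": 18.6}

if __name__ == "__main__":
    print(relative_runtimes(example, "2007q3"))
```

A value below 1.00 means the build decoded the sample faster than the 2007q3 build; a value above 1.00 means it was slower.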

Looking at the table, I make these observations:

CodeSourcery 2009q1 produces faster integer code, but slower floating-point code, than 2007q3.
GCC 4.4.1 produces much slower code than 4.3.3 in several cases, and is never significantly better.
CodeSourcery GCC generally beats FSF GCC.
ARM RVCT readily beats every GCC version. The MP3 figure is not a typo.
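These observations can be cross-checked by reducing each column of the table to a single number. The sketch below uses the geometric mean, a common choice for averaging ratios; that choice is an assumption of this example, not something the article does.

```python
import math

COMPILERS = ["2009q1", "4.3.3", "4.4.1", "RVCT"]

# Relative runtimes transcribed from the table (baseline: CodeSourcery 2007q3).
TABLE = {
    "cathedral":        (0.97, 1.02, 1.09, 0.93),
    "NeroAVC":          (0.98, 1.02, 1.12, 0.95),
    "indiana_jones_4":  (0.97, 1.02, 1.09, 0.89),
    "NeroRecodeSample": (0.96, 1.03, 1.27, 0.96),
    "Silent_Light":     (0.89, 0.88, 0.97, 0.44),
    "When_I_Grow_Up":   (0.98, 0.98, 0.93, 0.86),
    "Lumme-Badloop":    (1.03, 1.03, 1.02, 0.97),
    "Canyon":           (1.02, 1.02, 0.99, 0.90),
    "lotr":             (1.02, 1.02, 1.00, 1.03),
}


def geometric_means(table, compilers):
    """Return the geometric mean of each compiler's column."""
    means = {}
    for i, name in enumerate(compilers):
        logs = [math.log(row[i]) for row in table.values()]
        means[name] = math.exp(sum(logs) / len(logs))
    return means


if __name__ == "__main__":
    ranked = sorted(geometric_means(TABLE, COMPILERS).items(),
                    key=lambda kv: kv[1])
    for name, mean in ranked:
        print(f"{name}: {mean:.3f}")
```

By this measure RVCT comes out fastest overall, followed by CodeSourcery 2009q1, then GCC 4.3.3, with GCC 4.4.1 last. A single average hides per-sample exceptions, such as the DTS sample, where RVCT is the slowest of the four.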

My recommendation for a free compiler is CodeSourcery 2009q1 unless your code makes heavy use of floating-point, in which case 2007q3 may give better results. If you prefer, for whatever reason, official GNU releases, 4.3.3 should be the version of choice. Avoid GCC 4.4.1; it is far too unpredictable.

Bootnotes

See also Mike’s test of x86 compilers.
Thanks to ARM for providing the RVCT compiler.
Thanks to TI for providing the Beagle board.
