INSIDE DSP ARTICLES  

Evaluating the DSP Capabilities of the Cortex-R4
By BDTI, 12/6/2007

In 2004, ARM announced its newest generation of licensable cores, called the “Cortex” family.  Cortex cores span a wide range of performance levels, with Cortex M-series cores at the low end, Cortex R-series cores providing mid-range performance, and the Cortex A-series applications processors offering the highest performance.  The first Cortex core to be announced was the Cortex-M3, and since then ARM has announced several others, including the Cortex-A8 and A9, the Cortex-M1, and the Cortex-R4.

The Cortex-R4 targets moderately demanding applications such as hard disk drives, inkjet printers, automotive safety systems, and wireless modems. It is marketed as a higher-performance replacement for the older ARM9E core. BDTI recently completed a benchmark analysis of the ARM Cortex-R4 core and is now releasing the first independent signal processing benchmark results for this processor. In this article, we’ll take a look at its benchmark results and compare its performance to that of other ARM cores (including the ARM11, another moderate-performance core) and selected competitors.

Table 1 summarizes key attributes of selected ARM processor cores.

 

|  | ARM9E | ARM11 | Cortex-R4 | Cortex-A8 w/NEON* |
|---|---|---|---|---|
| Typical clock rate* | 265 MHz (130 nm) | 335 MHz (130 nm) | 375 MHz (90 nm) | 450 MHz–1100 MHz (65 nm) |
| Instruction sets | ARMv5E, Thumb | ARMv6, Thumb, Thumb2 | ARMv7, Thumb, Thumb2 | ARMv7, Thumb, Thumb2, NEON |
| Issue width | Single issue | Single issue | Dual issue (superscalar) | Dual issue (superscalar) |
| Pipeline stages | 5 | 8 | 8 | 13 + 10 (NEON) |
| DSP/media instructions | Minor | Minor | Minor | Extensive (NEON) |
| Per-cycle multiply-accumulate throughput (fixed-point) | 1 × 32-bit; 1 × 16-bit | 1 × 32-bit; 2 × 16-bit | 1 × 32-bit; 2 × 16-bit | 2 × 32-bit; 4 × 16-bit; 8 × 8-bit; Float: 2 × 32-bit |
| Data bus | 32-bit | 64-bit | 64-bit | 64-/128-bit |
| Branch prediction | No | Yes | Yes | Yes |

Table 1. Characteristics of selected ARM cores.

*Clock speed data provided by ARM, not verified by BDTI. Clock speeds for ARM9E and ARM11 are worst-case speeds in a TSMC CL013G process and ARM Artisan SAGE-X library. Clock speed for Cortex-R4 is worst-case for a 90 nm CLN90G Artisan Advantage implementation. High-end clock speed for Cortex-A8 is based on a custom implementation.
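
To put the clock-rate and multiply-accumulate rows of Table 1 side by side, a rough upper bound on 16-bit MAC throughput is simply the clock rate multiplied by the number of MACs per cycle. The sketch below is illustrative arithmetic only: the function name `peak_mmacs` is hypothetical, the clock rates are the ARM-supplied figures from Table 1 (not verified by BDTI), and the result ignores memory stalls, load bandwidth, and the fact that the cores are quoted in different fabrication processes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: theoretical peak 16-bit MAC rate in millions of
 * MACs per second, computed as clock rate (MHz) x MACs per cycle.
 * This is an upper bound, not a benchmark result. */
uint32_t peak_mmacs(uint32_t clock_mhz, uint32_t macs_per_cycle)
{
    assert(macs_per_cycle > 0);
    return clock_mhz * macs_per_cycle;
}
```

With the Table 1 figures this gives 265 for the ARM9E (265 MHz × 1), 670 for the ARM11 (335 MHz × 2), 750 for the Cortex-R4 (375 MHz × 2), and 1800 to 4400 for the Cortex-A8 with NEON (450 to 1100 MHz × 4). Real signal processing kernels will fall below these peaks.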

As shown in Table 1, the Cortex-R4 is a superscalar core that can issue and execute up to two instructions per cycle. Like the Cortex-A8, it supports the ARMv7 instruction set architecture and the Thumb2 compressed instruction set, but the Cortex-R4 does not support the NEON signal processing extensions.  As a result, its signal processing capabilities and features are much more limited than those of the Cortex-A8.
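
The "2 × 16-bit" multiply-accumulate figure in Table 1 matters most in kernels such as dot products and FIR filters on 16-bit fixed-point data. The following portable C sketch shows the shape of such a loop, with two 16-bit products feeding one accumulator per iteration. It is an illustration rather than BDTI benchmark code: the function name `dot16_pairs` is hypothetical, the 64-bit accumulator is a choice made here to avoid overflow in plain C, and whether a compiler maps the loop body onto a dual-MAC instruction depends on the toolchain and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: dot product of two 16-bit vectors, processed two
 * samples at a time to mirror a dual 16-bit multiply-accumulate step.
 * n must be even. A 64-bit accumulator is used so the sum cannot
 * overflow for any realistic vector length. */
int64_t dot16_pairs(const int16_t *x, const int16_t *y, size_t n)
{
    assert(n % 2 == 0);
    int64_t acc = 0;
    for (size_t i = 0; i < n; i += 2) {
        /* Two 16 x 16-bit products are added to the accumulator
         * in each iteration. */
        acc += (int64_t)x[i] * y[i] + (int64_t)x[i + 1] * y[i + 1];
    }
    return acc;
}
```

On a core that sustains two 16-bit MACs per cycle, each loop iteration corresponds to one cycle of multiplier work at best; on a core limited to one 16-bit MAC per cycle, the same iteration needs at least two.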

Copyright 2006-2008 by BDTI