By Jennifer White & Jeff Bier, 5/11/2004
Last month Stretch Inc. announced the
availability of an unusual new processor chip family, the S5000. This
family pairs a 300 MHz Tensilica Xtensa RISC processor core with a 100
MHz reconfigurable compute fabric. The reconfigurable fabric, referred
to as the Instruction Set Extension Fabric (ISEF), allows a set of
custom instructions to be added to the RISC processor's instruction set
at run-time via software. The ISEF is logically separated into two
sections, an organization that allows one section to be reconfigured
with a new set of instructions while the other section is operating
normally. (Reconfiguring requires 80-100 microseconds.)
The ISEF targets the compute-intensive
portions of signal processing applications, where it should provide
significant performance gains. According to Stretch, the ISEF can
implement instructions that collapse tens or hundreds of compute
operations into just one instruction. For example, the ISEF can
implement the sum of absolute differences function found in H.263
motion estimation. ISEF instructions execute at a 100 MHz rate
(one-third the instruction rate of the Xtensa core) and may have
latencies of 10 or more Xtensa clock cycles.
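To make the sum of absolute differences example concrete, here is a minimal plain-C sketch of the kind of loop nest a single custom instruction might collapse. The function name `sad_16x16`, the 16x16 block size, and the `stride` parameter are illustrative assumptions; this is ordinary portable C, not Stretch's tool flow or API.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute differences (SAD) between two 16x16 blocks of 8-bit
 * pixels. The block size and the function signature are assumptions
 * made for illustration. `stride` is the distance, in bytes, between
 * the starts of consecutive rows in both buffers. */
uint32_t sad_16x16(const uint8_t *cur, const uint8_t *ref, size_t stride)
{
    uint32_t sum = 0;

    for (size_t row = 0; row < 16; ++row) {
        for (size_t col = 0; col < 16; ++col) {
            /* Widen to int before subtracting so the difference can go negative. */
            int diff = (int)cur[row * stride + col] - (int)ref[row * stride + col];
            sum += (uint32_t)abs(diff);
        }
    }
    return sum;
}
```

Each call performs 256 subtractions, 256 absolute values, and 256 accumulations. Independent, narrow operations of this kind are the sort of work that parallel hardware can, in principle, fold into far fewer instructions.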
Using its full capacity, the ISEF can theoretically support an
instruction that requires 400 16-bit ALUs and 64 16-bit multipliers
operating in parallel, although it is unlikely that full utilization of
the ISEF will be achieved in typical applications. The ISEF can support
a large number of custom instructions simultaneously, but Stretch
believes that typical applications will require only a handful of
instructions to achieve significant performance gains.
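As a rough back-of-the-envelope check on the figures above, the following C sketch derives a theoretical peak operation rate and the clock ratio between the two domains. It assumes, purely for illustration, that every ALU and multiplier completes one operation per 100 MHz ISEF cycle; the constant and function names are hypothetical, and the result is an upper bound rather than a performance claim.

```c
#include <stdint.h>

/* Figures quoted in the article. */
#define XTENSA_CLOCK_HZ  300000000ULL  /* 300 MHz RISC core */
#define ISEF_CLOCK_HZ    100000000ULL  /* 100 MHz reconfigurable fabric */
#define ISEF_ALUS        400ULL        /* 16-bit ALUs at full capacity */
#define ISEF_MULTIPLIERS 64ULL         /* 16-bit multipliers at full capacity */

/* Theoretical peak 16-bit operations per second, assuming (hypothetically)
 * one operation per functional unit per ISEF clock cycle. */
uint64_t isef_peak_ops_per_second(void)
{
    return (ISEF_ALUS + ISEF_MULTIPLIERS) * ISEF_CLOCK_HZ;
}

/* Number of Xtensa core cycles that elapse during one ISEF cycle. */
uint64_t xtensa_cycles_per_isef_cycle(void)
{
    return XTENSA_CLOCK_HZ / ISEF_CLOCK_HZ;
}
```

Under that assumption the peak works out to 46.4 billion 16-bit operations per second, with three Xtensa cycles per ISEF cycle. Real applications would be expected to fall well short of this, since full utilization of the fabric is unlikely.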
Many previous attempts to commercialize reconfigurable
processors have failed, primarily because of the considerable effort
required to implement, optimize, and verify applications on such
architectures. Stretch aims to minimize this hurdle through automatic
configuration of the ISEF based on C/C++ application source code.
According to Stretch, custom instructions are automatically
incorporated into the processor through compiler analysis of
compute-intensive portions of C/C++ application software that have been
flagged by the user. This automated implementation of custom
instructions promises to dramatically reduce application development
time compared with ASIC- and FPGA-based solutions.
Further simplifying matters, Stretch has not invented a
complicated new programming model: ISEF instructions are issued by the
Xtensa core just like built-in RISC instructions, and they access their
own dedicated register file. Stretch is also able to leverage the
relatively mature compiler and development infrastructure of the
Tensilica architecture.
Based on performance estimates from Stretch, the S5000 family
promises to be significantly faster than today's fastest mainstream
DSPs—at least on some applications. Many signal processing applications
exhibit significant data parallelism and make heavy use of specialized
operations. BDTI expects that the Stretch architecture will be a good
match for such applications. Although the S5000 family will not match
the throughput of ASICs or large FPGAs, it may well find homes in many
applications where the potential for rapid development outweighs the
need for absolute maximum performance.