As the name suggests, the ‘C64x+ is based on TI's well-established
‘C64x DSP architecture. The ‘C64x+ is object-code compatible with its
predecessor and similar to it in most respects. For
example, both architectures can execute up to eight instructions per
cycle. And like current ‘C64x-based chips, the new ‘C6455 will operate
at up to 1 GHz. However, the ‘C64x+ includes some important upgrades
that significantly improve both the throughput and the
memory-efficiency of the new architecture.
The most prominent upgrade is increased multiply-accumulate (MAC)
throughput. The ‘C64x+ can perform up to eight 16-bit MAC operations
per cycle, compared to a maximum of four MAC operations per cycle on
the ‘C64x. The ‘C64x+ is also able to complete up to two 32 x 32 MAC
operations per cycle. In contrast, the ‘C64x does not directly support
32 x 32 MAC operations. The ‘C64x+ also offers expanded add and
subtract capabilities, as well as new bit-manipulation instructions
that accelerate security and communications algorithms.
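
To make the MAC figures concrete, here is a minimal C sketch of a 16-bit dot product, the kind of kernel in which every element costs one multiply-accumulate. It is portable C, not ‘C64x+ code, and the names dot16 and ideal_mac_cycles are hypothetical helpers introduced only for this illustration. The cycle estimate assumes the peak MAC rate is sustained on every cycle and ignores loop overhead and memory stalls, so it is a lower bound rather than a measured result.

```c
#include <stddef.h>
#include <stdint.h>

/* Dot product of two 16-bit vectors: one multiply-accumulate (MAC)
 * per element. A 64-bit accumulator is used so that the sum cannot
 * overflow for realistic vector lengths. */
int64_t dot16(const int16_t *x, const int16_t *y, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)x[i] * (int32_t)y[i];
    }
    return acc;
}

/* Lower bound on cycles for n MACs if macs_per_cycle MACs complete
 * every cycle (assumption: no loop overhead, no memory stalls). */
size_t ideal_mac_cycles(size_t n, size_t macs_per_cycle)
{
    return (n + macs_per_cycle - 1) / macs_per_cycle;
}
```

Under these assumptions, a 64-element dot product needs at least ideal_mac_cycles(64, 8) = 8 cycles at the ‘C64x+ peak rate of eight 16-bit MACs per cycle, versus ideal_mac_cycles(64, 4) = 16 cycles at the ‘C64x peak rate of four.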
Interestingly, the ‘C64x+ adds no video-specific instructions. This is
striking because video applications are a key target for the new
architecture. Although the ‘C64x+ will have respectable
video-processing capabilities, it could have benefited from additional
video-oriented instructions.
Moving beyond new instructions, the ‘C64x+ also takes a new approach to
software-pipelined loops, which are used heavily in optimized ‘C64x code
to reduce the impact of the deep pipeline. The ‘C64x+ adds a loop
buffer that greatly reduces the need for loop setup and cleanup code.
The obvious benefit of this change is that it reduces code size in
loop-intensive signal-processing code. The loop buffer also allows the
programmer to schedule instructions that execute only once in parallel
with loop instructions. This feature makes use of execution slots that
would otherwise go unused, significantly improving performance in some
cases.
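
The setup and cleanup code mentioned above can be illustrated with a conceptual sketch in C. It is not ‘C64x+ assembly and does not model the loop buffer itself; it only shows the prolog, kernel, and epilog structure that a hand-pipelined loop requires and that, according to the description above, the loop buffer largely removes from the program. The function names and the two-stage split (load, then multiply) are assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain loop: the load and the multiply for element i happen in the
 * same iteration. */
void scale_simple(const int16_t *in, int32_t *out, size_t n, int16_t gain)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = (int32_t)in[i] * gain;
    }
}

/* Hand-pipelined version: the load for element i+1 overlaps the
 * multiply for element i. The extra code before and after the loop
 * is the setup (prolog) and cleanup (epilog). */
void scale_pipelined(const int16_t *in, int32_t *out, size_t n, int16_t gain)
{
    if (n == 0) {
        return;
    }
    int16_t loaded = in[0];                /* prolog: fill the pipeline */
    for (size_t i = 0; i + 1 < n; i++) {   /* kernel: steady state */
        int16_t next = in[i + 1];          /* stage 1 of iteration i+1 */
        out[i] = (int32_t)loaded * gain;   /* stage 2 of iteration i */
        loaded = next;
    }
    out[n - 1] = (int32_t)loaded * gain;   /* epilog: drain the pipeline */
}
```

Both functions produce the same output. The second spends extra program memory on the prolog and epilog, and that overhead grows with pipeline depth, which is why removing it matters for code size in loop-heavy signal-processing code.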
Although the loop buffer brings important benefits, it requires a style
of programming that many assembly-level programmers will find
unfamiliar and challenging. This is particularly problematic because
the ‘C64x was already a challenging assembly-code target.
Last but not least, the ‘C64x+ supports 16-bit wide instruction words
as well as the 32-bit instructions used by the ‘C64x. The use of
mixed-width instruction sets is a common memory-saving feature, but the
‘C64x+ takes an unusual approach to implementing it: the programmer
cannot specify which instructions use 16-bit encoding. Instead, the
assembler determines where it can use 16-bit encoding. Because it is
hard to predict where 16-bit instructions will be used, assembly-level
programmers will find it difficult to minimize
memory use. The upside of TI's approach is that re-assembling ‘C64x
code for the ‘C64x+ will usually provide significant memory savings.
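
A rough way to reason about the potential savings is to count bytes. The sketch below is a back-of-the-envelope model, not a description of the actual ‘C64x+ encoding: it assumes every instruction occupies exactly 4 bytes (32-bit form) or 2 bytes (16-bit form) and ignores any packet-level encoding overhead. The function name estimate_code_bytes is hypothetical.

```c
#include <stddef.h>

/* Estimated program size in bytes when n_compact of n_total
 * instructions use the 16-bit encoding and the rest use the 32-bit
 * encoding. Assumption: no per-packet or alignment overhead.
 * Requires n_compact <= n_total. */
size_t estimate_code_bytes(size_t n_total, size_t n_compact)
{
    return 4 * (n_total - n_compact) + 2 * n_compact;
}
```

In this simplified model, 100 instructions occupy 400 bytes with no 16-bit encodings, 320 bytes if the assembler manages to compact 40 of them, and 200 bytes in the best case where all 100 are compacted. Since the assembler rather than the programmer decides n_compact, the programmer can observe the result but cannot directly set it.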
BDTI recently completed an analysis of the ‘C64x+ using its BDTI
Benchmarks. Based on the results of this analysis, the combination of
new instructions and the loop buffer gives the ‘C64x+ a 20% performance
boost over its predecessor. On some algorithms, the ‘C64x+ also uses
roughly half as much program memory as the ‘C64x. (An analysis of both
program and data memory use shows that the ‘C64x+ uses about 15% less
memory than its predecessor overall.) Benchmark results for the ‘C64x
and ‘C64x+ are available at http://www.BDTI.com/bdtimark/BDTImark2000.htm.
Overall, the ‘C64x+ is a significant, if not revolutionary, improvement
over the ‘C64x. By improving both speed and memory use, TI is sure to
strengthen its lead in high-performance DSP. The main challenge for TI
will be helping its customers deal with the increased complexity in
what was already a highly complicated architecture.
The ‘C6455 is expected to begin sampling in the third quarter of 2005.
Volume production is scheduled for the second quarter of 2006. Planned
pricing for 10,000-unit orders is $259 for the 1 GHz version, $219 for
the 850 MHz version, and $179 for the 720 MHz version.