Optimization
Audio software optimization is the process of reducing the hardware
resources required by the software while maintaining sufficient audio
quality. When output audio samples are not produced quickly enough, the
product's hard real-time constraints are not met and audible, annoying
artifacts (such as clicks and pops) result. Therefore, audio software
developers initially focus on reducing the computational load to
ensure samples are produced in a timely manner.
Optimizing for memory space comes second; although it is important
because it reduces the system cost, it is less critical than meeting
real-time constraints. There is typically a tradeoff between speed and
size: reducing the computational load often increases the memory
requirements. For example, instead of calculating parameters in real
time, it may be faster to look up the required values using a
pre-calculated table that consumes memory space.
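As a rough illustration of this tradeoff, the sketch below converts a volume setting in decibels to a linear gain by table lookup instead of evaluating a power function each time. The names (gain_table, gain_table_init, gain_from_attenuation_db), the 1 dB step, and the 0 to 96 dB range are assumptions made for this example, not details from the text.

```c
#include <stddef.h>

/* Hypothetical range for this sketch: 0 dB to 96 dB of attenuation, 1 dB steps. */
#define GAIN_TABLE_SIZE 97
/* 10^(-1/20): the linear gain factor for one 1 dB step of attenuation. */
#define GAIN_STEP_1DB   0.8912509381337456

static float gain_table[GAIN_TABLE_SIZE];

/* Fill the table once, for example at power-up. Each entry is the previous
 * one multiplied by the 1 dB step, so no power function is called at all. */
void gain_table_init(void)
{
    double g = 1.0;
    for (size_t i = 0; i < GAIN_TABLE_SIZE; i++) {
        gain_table[i] = (float)g;
        g *= GAIN_STEP_1DB;
    }
}

/* Real-time path: one bounds check and one memory read replace the
 * calculation. Requests beyond the table are clamped to the last entry. */
float gain_from_attenuation_db(unsigned atten_db)
{
    if (atten_db >= GAIN_TABLE_SIZE) {
        atten_db = GAIN_TABLE_SIZE - 1;
    }
    return gain_table[atten_db];
}
```

The speed gain is paid for in memory: with 4-byte floats this table occupies 97 x 4 = 388 bytes, and a finer step or a wider range would grow it proportionally.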
The first step in optimization is to analyze the code to find
the routines with the highest processor load and then improve them one
by one. The processing rates for audio software routines can typically
be separated into three categories, sometimes called I-rate, K-rate,
and S-rate. I-rate routines run infrequently, typically at power-up or
for a mode change. K-rate routines run more frequently and are often
the user interface and control functions. S-rate routines run the most
frequently and usually generate the actual audio output samples. S-rate
routines use the lion's share of the processor's time, so they should
be optimized first.
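One way to picture the three rates is a simple volume control split into three routines. This is only a sketch: the Volume structure, the function names, and the smoothing value are hypothetical and chosen for illustration.

```c
#include <stddef.h>

typedef struct {
    float gain;        /* gain currently applied to the audio */
    float target_gain; /* gain most recently requested by the control code */
    float smoothing;   /* fraction of the remaining distance covered per update */
} Volume;

/* I-rate: runs once, at power-up or on a mode change. */
void volume_init(Volume *v)
{
    v->gain = 1.0f;
    v->target_gain = 1.0f;
    v->smoothing = 0.5f;   /* assumed value for this sketch */
}

/* K-rate: runs once per control period (for example, once per block).
 * Moves the applied gain part of the way toward the requested gain. */
void volume_control(Volume *v, float requested_gain)
{
    v->target_gain = requested_gain;
    v->gain += v->smoothing * (v->target_gain - v->gain);
}

/* S-rate: the loop body runs once per audio sample, so this is where
 * most of the processor time goes and where optimization pays off first. */
void volume_process(const Volume *v, const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = in[i] * v->gain;
    }
}
```

With a 64-sample block, for instance, the S-rate loop body executes 64 times for every single call to the K-rate routine, which is why a cycle saved there matters far more than one saved in the control code.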
There are many methods for reducing the computational load of
audio software routines. High-level optimizations include techniques
such as substituting one algorithm for another or processing blocks of
samples at a time instead of a single sample at a time. Lower-level
optimizations include manually selecting and scheduling instructions
for key inner loops, and employing specialized processor hardware
features like DMA.
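The difference between single-sample and block processing can be sketched with a one-pole lowpass filter. The OnePole type and both function names are hypothetical; the filter itself is only a convenient stand-in for any per-sample routine.

```c
#include <stddef.h>

/* One-pole lowpass: y[n] = y[n-1] + a * (x[n] - y[n-1]) */
typedef struct {
    float a;  /* filter coefficient, 0 < a <= 1 */
    float z;  /* previous output y[n-1] */
} OnePole;

/* Single-sample version: every call pays the function-call overhead and
 * reloads and stores the filter state through the pointer. */
float onepole_tick(OnePole *f, float x)
{
    f->z += f->a * (x - f->z);
    return f->z;
}

/* Block version: the coefficient and state are loaded once, kept in local
 * variables (which a compiler can hold in registers) for the whole block,
 * and the state is written back once at the end. */
void onepole_process_block(OnePole *f, const float *in, float *out, size_t n)
{
    const float a = f->a;
    float z = f->z;
    for (size_t i = 0; i < n; i++) {
        z += a * (in[i] - z);
        out[i] = z;
    }
    f->z = z;
}
```

Both versions compute the same output. How much the block version saves depends on the compiler and processor, so it is worth measuring rather than assuming. The usual cost of block processing is added latency, since a full block must be collected before it can be processed.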
Reducing Memory Size and Bandwidth Needs
Reverb algorithms typically use more data memory (32 kB to 128 kB or
more), in the form of delay lines, than other audio post-processing
effects. Delay lines, which are implemented using a contiguous chunk of
memory, are a common building block for audio effects and are used to
delay a signal by a specific amount of time. One way to implement a
delay line is as a first-in, first-out (FIFO) buffer that uses
circular addressing. A write pointer points to the location in
memory where the next input sample is to be written into the delay
line, and a read pointer points to the next sample to be read from the
delay line. The number of samples between the two pointers defines the
delay time. When either of the pointers reaches the end of the buffer,
it "wraps around" to the beginning. Memory access is a common
bottleneck, so optimizations that reduce the number of memory accesses
and take advantage of specialized memory-access hardware can produce
good results.
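The circular-buffer delay line described above might be sketched as follows. The DelayLine type and function names are hypothetical, array indices stand in for the read and write pointers, and the caller supplies the storage, which is an assumption that suits systems without dynamic allocation.

```c
#include <stddef.h>

typedef struct {
    float  *buf;        /* contiguous storage supplied by the caller */
    size_t  len;        /* buffer length in samples */
    size_t  write_idx;  /* where the next input sample will be written */
    size_t  read_idx;   /* where the next output sample will be read */
} DelayLine;

/* Set up a delay of `delay` samples, with 1 <= delay <= len.
 * Returns 0 on success, -1 if the arguments are out of range. */
int delay_init(DelayLine *d, float *storage, size_t len, size_t delay)
{
    if (len == 0 || delay == 0 || delay > len) {
        return -1;
    }
    for (size_t i = 0; i < len; i++) {
        storage[i] = 0.0f;               /* start from silence */
    }
    d->buf = storage;
    d->len = len;
    d->write_idx = 0;
    d->read_idx = (len - delay) % len;   /* read trails write by `delay` samples */
    return 0;
}

/* Process one sample: read the delayed sample, store the new one, then
 * advance both indices, wrapping to the start at the end of the buffer. */
float delay_tick(DelayLine *d, float in)
{
    float out = d->buf[d->read_idx];
    d->buf[d->write_idx] = in;
    if (++d->read_idx == d->len) {
        d->read_idx = 0;
    }
    if (++d->write_idx == d->len) {
        d->write_idx = 0;
    }
    return out;
}
```

Each sample costs one memory read and one memory write, plus the wrap checks. On processors with hardware circular addressing the wrap checks can be done by the address generator instead of by explicit comparisons. As a sense of scale, and assuming 4-byte samples at a 48 kHz sample rate, a 32 kB (32,768-byte) buffer holds 8,192 samples, or roughly 171 ms of delay.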
Reducing the amount of memory required is one motivation for choosing
an alternative algorithm. The idea is to find a different algorithm
that produces nearly the same output while using less memory. Reverb
algorithms have been an area of active research for many years, so the
literature is filled with options. In place of the Schroeder reverb we
might use an alternative algorithm that requires less memory and makes
fewer memory accesses. A few algorithms reduce the amount of data
memory enough that it may be possible to place all of the
delay lines in fast, on-chip memory, thereby nearly eliminating
memory-access bottlenecks. But the subjective nature of perceived audio
quality means that caution is advised: unless the reduced-memory
algorithm is properly designed, it won't sound very good.