Bottom     Previous     Contents

Annex D - When does FFT Convolution/Correlation Pay?

Although it is widely known that FFT's provide a fast way to implement long convolutions or correlations (a correlation is just a convolution with the impulse response reversed and conjugated), using them requires longer and more complex code. So the question arises when is it worth it? Just how long does a convolution have to be before using FFT's provides significant savings? For some reason, this issue in not addressed in most DSP/FFT literature, despite its importance to a DSP system designer. To get a feel for the answer to this we shall consider the implementation of a complex 1D FIR filter on an 'infinite' 1D complex data stream by two methods. Direct convolution and Radix 2 FFT convolution using 50% overlap between FFT's.

We shall assume that the filter contains 2p+1 taps (its impulse response is 2p+1 samples long). This length has been chosen to ensure that when convolved with 2p signal points, the resulting sequence has length (2p+1)+(2p)-1=2p+1, which fits a Radix 2 FFT comfortably.

In each case, we shall quantify the algorithm in terms of the number of complex additions NA(p) and the number of complex multiplies NM(p) required to generate 2p output points.

Direct Convolution.

Direct convolution can be performed point by point or in blocks. In either case each output point will require 2p+1 complex multiplies and 2p complex additions. Therefore:

equation

equation

FFT Convolution.

This scheme works as follows:

  1. Divide the input stream into adjacent blocks of 2p points.
  2. Perform a 2p+1 point DIF FFT of each block, which has been padded by 2p zeros.
  3. Multiply by the 2p+1 point FFT of the impulse response (a constant look up table which may include scaling).
  4. Perform a 2p+1 point IFFT of the resulting block, generating 2p+1 output points.
  5. Add the first 2p output points to the last 2p output points of the previous convolved block. (This data is the filter output.) Store the last 2p output points for the next convolved block.

The following optimisations are assumed:

  1. Each FFT/IFFT makes full use of the 'trivial' butterflies, thereby avoiding unnecessary complex multiplies.
  2. The forward (DIF) FFT can avoid 2p complex additions in the first pass, because the 'bottom' half of the array is zero.

We may note that the saving presented optimisation 2 above effectively cancels out the cost of step 5, so we are left with two 2p+1 point FFT's (or IFFT's) and 2p+1 complex multiplies. We have derived the number of butterflies required by a 2p Radix 2 FFT:

equation

The number of 'trivial' butterflies (which require no multiplication) is (p>2 !!):

equation

Each (non trivial) butterfly requires 2 complex additions and 1 complex multiply.So, for a single 2p+1 point FFT (p>1 !!):

equation

equation

So for the entire algorithm (two 2p+1 point FFT's (or IFFT's) and 2p+1 complex multiplies):

equation

equation

Comparing of the Results.

To make the comparison simpler, we shall assume that real multiplies and adds take the same time (as they often do), and combine the figures for NA(p) and NM(p) as NOP(p) operations on real numbers as follows:

equation

So for direct convolution we have:

equation

equation

equation

For FFT convolution we have:

equation

equation

equation

The ratio of these two is what we really want..

equation

This tells us (roughly) how much slower direct convolution (or correlation) is compared to FFT convolution. Let's tabulate a few values:

p FIR Taps (=2p+1) (Direct/FFT) Ratio
3 9 1.2
4 17 1.7
5 33 2.7
6 65 4.5
7 129 7.6
8 257 13.2
9 513 23.3
10 1025 41.8

For large p, a good approximation is:

equation

So it would appear that even for filters as short as 17 taps FFT based convolution looks attractive (in terms of the number of arithmetic operations). However, a few notes of caution are appropriate:

As usual, if you really want to know which method is fastest for your particular application on your particular processor, the best advice is to try them both.

©1999 - Engineering Productivity Tools Ltd.


Next     Top