Why should I zero-pad a signal before taking the Fourier transform?
In an answer to a previous question, it was stated that one should
zero-pad the input signals (add zeros to the end so that at least half of the wave is "blank")
What's the reason for this?
It depends on what you're doing. This could have been a comment on my answer. I added some explanation to it.
@endolith: I initially thought to put it as a comment, but I think the question may be of general interest, and that it would be a pity if a good answer to it was buried in comments somewhere. If you disagree, I'll delete this question.
Well it's a very general question. You can zero pad to make something a power-of-2, you can zero pad to make circular transform behave like non-circular transform, you can do it to resample a signal, to change frequency resolution, etc. etc.
Zero padding allows one to use a longer FFT, which will produce a longer FFT result vector.
A longer FFT result has more frequency bins that are more closely spaced in frequency. But they will be essentially providing the same result as a high quality Sinc interpolation of a shorter non-zero-padded FFT of the original data.
This might result in a smoother looking spectrum when plotted without further interpolation.
Although this interpolation won't help with resolving or the resolution of and/or between adjacent or nearby frequencies, it might make it easier to visually resolve the peak of a single isolated frequency that does not have any significant adjacent signals or noise in the spectrum. Statistically, the higher density of FFT result bins will probably make it more likely that the peak magnitude bin is closer to the frequency of a random isolated input frequency sinusoid, and without further interpolation (parabolic, et.al.).
But, essentially, zero padding before a DFT/FFT is a computationally efficient method of interpolating a large number of points.
Zero-padding for cross-correlation, auto-correlation, or convolution filtering is used to not mix convolution results (due to circular convolution). The full result of a linear convolution is longer than either of the two input vectors. If you don't provide a place to put the end of this longer convolution result, FFT fast convolution will just mix it in with and cruft up your desired result. Zero-padding provides a bunch zeros into which to mix the longer result. And it's far far easier to un-mix something that has only been mixed/summed with a vector of zeros.
The last paragraph is the key answer to the original question, although I think it could be stated more clearly. Zero-padding in the context of correlation or convolution can be done to ensure that implementing the process in the frequency domain yields *linear* instead of *circular* convolution/correlation. However, doing so is not a requirement if you're willing to do some bookkeeping work on the side, like in the overlap-save and overlap-add algorithms.
@Jason R: Actually, they are both circular convolution. A normal (non-pruned) FFT does all the multiplies and adds for the wrap around part of the result. It's just that in the sufficiently zero-padded case, all those multiplies and adds are of the value zero, so nobody cares about the nothing that is computed and wrapped around the circle.
Indeed; multiplication of the DFTs of two signals does always implement circular convolution. I should have worded it differently: you stuff zeros at the end of one signal to ensure that the result obtained by circularly convolving them is the same as what you get if you linearly convolve them (assuming that linear convolution is what you want, which is usually the case).
There are a few things to consider before you decide to zero pad your time-domain signal. You may not need to zero pad the signal at all!
1) Lengthen the time-domain data (not zero padding) to get better resolution in the frequency domain.
2) Increase the number of FFT points beyond your time-domain signal length (zero padding) if you would like to see better definition of the FFT bins, though it doesn't buy you any more true resolution. You can also pad to get to a power of 2 number of FFT points.
3) When fiddling with the FFT points (in the previous point), make sure your frequency points end up where you want them. The spacing of the points is $f_s/N$, where $f_s$ is the sampling frequency and $N$ is the number of FFT points.
There are some nice figures illustrating these points at http://www.bitweenie.com/listings/fft-zero-padding/
One last thing to mention: If you zero pad the signal in the time domain and you want to use a windowing function, make sure you window the signal before you zero pad. If you apply the window function after zero padding, you won't accomplish what the window is supposed to accomplish. More specifically, you'll still have a sharp transition from the signal to zero instead of a smooth transition to zero.
In general zero-padding prior to DFT is equivalent to interpolation, or sampling more often, in the transformed domain.
Here is a quick visualization of how the opposite works. If you sample a bandlimited signal in time at higher rate, you get a more 'squashed' spectrum, i.e. a spectrum with more zeros at both ends. In other words, you can obtain more samples in time by simply zero-padding in frequency after DFT'ing, and then IDFT'ing the zero-padded result.
The same effect holds in reverse when zero-padding occurs in time. This is all because the perfect signal reconstruction is possible as long as a signal is bandlimited and sampled at least at the Nyquist rate.
The term 'resolution' depends on how you define it. For me, it means how well the two adjacent points of observation in time or frequency can be reliably (statistically) discriminated. In this case the resolution actually depends on the DFT size due to spectral leakage. That is, smaller the window size, more blurry or smeared the transformed signal, and vice versa. It is different from how often you sample, or what I term 'definition'. For example, you can have a very blurry image sampled at high rate (high definition), yet you still cannot obtain more information than sampling at lower rate. So in summary, zero-padding does not improve resolution at all since you do not gain any more information than before.
If one has any interest in the spectrum of the windowing function used to isolate the time-domain sample, then zero-padding WILL increase the frequency resolution of the windowing function.
If the time signal is $x(t)w(t)$, where $w(t)$ is the windowing function, then the overall spectrum is $X(f) * W(f)$, where $*$ inidicates convolution.
If your windowing function is a simple rectangle (an extraction of some set of values from $x(t)$. Then $X(f)$ is the sync function. So, for example, if Nfft is the same as the width of your rectangle, and you had a sinusoid at precisely one of the bin frequencies, then the samples of the sync function that would appear centered on that bin happen to fall exactly at the off-peak zero crossings, and you don't see the shape of the sync in the spectrum at all. If you now zero pad your data going into the FFT, you'll see some samples at places other than the peak and the zero crossings, revealing the shape of the sync function in the resulting spectrum. So of what use is zero-padding? It is certainly of educational use in revealing the nature of the discrete transform of windowed signals, which is the usual case. In a practical sense, it could be useful in any case where you are interested in the spectral shape of an isolated envelope riding on a carrier wave.
There can be different reasons for this depending on any processes carried out before and after the Fourier transform. The most common reason is to achieve greater frequency resolution in any resulting transform. That is to say that, the larger the number of samples used in your transform, the narrower the binwidth in the resulting power spectrum. Remember: binwidth = sample_frequency/transform_size (often called window size). You can imagine from this, that as you increase your transform size, binwidth reduces (=better frequency resolution). Zero padding is a way of increasing the transform size without introducing new information to the signal.
So why not just take a bigger transform without zero padding? Would that not achieve the same effect? Good question. In many cases you may want to analyze a stream of time domain data, for which you may be using a short time Fourier transform (stft). This involves taking a transform every N samples according to the time resolution you need in order to characterize changes in the frequency spectrum. Here in lies the problem. Too big a window and you will lose time resolution, too small a window and you will lose frequency resolution. The solution then is to take small time domain windows giving you good time resolution and then zero pad them to give you good frequency resolution. Hope this is useful for you
I did not explain this well. I should have clarified it better. Referring to a windowed transform, indeed you get no 'actual' greater frequency resolution but for visualisation purposes (reading the power spectrum with the eye) it can provide clearer results. Using the critical sampling rate, each side lobe occupies a single bin, which depending on the graphing technique could be misleading. Zero padding provides an interpolated frequency spectrum which may be more revealing. Additionally, if you are using a simple peak picking method for frequency estimation, the spectral interpolation effect of zero padding will give you a spectral sample closer to the true peak of the main lobe. This link provides some useful diagrams: http://www.dsprelated.com/dspbooks/sasp/Practical_Zero_Padding.html
This answer is not correct. Zero-padding does not improve frequency resolution at all; it merely interpolates between the outputs of the smaller transform. You can think of zero-padding as adding more frequency bins that have the same bandwidth as they do with the smaller transform; therefore, from a filter bank perspective, their passbands overlap.
If it helps understanding: You can also do the opposite: take the FFT of a signal, then zero-pad the result, and inverse FFT. This will have the effect of interpolating the original signal. But of course the signal will still be the same signal, with the same Nyquist bandwidth. Interpolation won't give you more higher frequency information than was originally there.
I did not see these mentioned in the prior good responses so I will add the following additional important reasons for zero padding:
Radix-2 algorithms are more efficient so zero padding out to the next power of 2 (or power of 4 in some cases for radix-4), or more significantly avoiding any large prime factors can improve real time performance. Also when using the FFT for analysis, zero padding is often done to compute samples of the DTFT, such as determining the frequency response of a FIR: compare fft([1 1 1 1]) to fft([1 1 1 1], 512) which is identical to freqz([1 1 1 1]).