Audio Noise Reduction in Windows Phone
This article shows an approach for audio noise reduction using Fast Fourier Transforms on Windows Phone.
Windows Phone 7.5
Fast Fourier Transform
This section provides a very simple and broad overview of the Fast Fourier Transform (FFT): the minimum needed to understand how the noise reduction algorithm works. For a slightly deeper view, see Sound pattern matching using Fast Fourier Transform in Windows Phone.
The Fast Fourier Transform is an efficient algorithm for computing the Discrete Fourier Transform (DFT). It transforms a function from the time domain (physical signals) into the frequency domain: in short, a spectrum showing the frequencies present in the sample. The inverse Fast Fourier Transform does the reverse, transforming the frequency domain representation back into a physical signal.
The FFT requires a discrete input function. Such inputs are created by sampling a continuous function, such as a person's voice, a song or ambient noise. The algorithm used here only applies to signals whose number of elements is a power of two (2^n), and it returns a set of complex numbers, the spectral components. The number of FFT elements is equal to the size of the time sample.
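As a minimal sketch of the power-of-two requirement, a buffer of arbitrary length can be copied into a zero-filled array of the next power-of-two size before it is transformed. The class and method names below are hypothetical and are not part of DSP.cs, which ships its own NextPowerOfTwo function.

```csharp
using System;

public static class PaddingSketch
{
    // Smallest power of two that is greater than or equal to n (hypothetical helper).
    public static int NextPowerOfTwo(int n)
    {
        if (n < 1 || n > (1 << 30)) throw new ArgumentOutOfRangeException("n");
        int p = 1;
        while (p < n) p <<= 1;
        return p;
    }

    // Copies the samples into a zero-filled array whose length is a power of two.
    public static double[] PadToPowerOfTwo(double[] samples)
    {
        double[] padded = new double[NextPowerOfTwo(samples.Length)];
        Array.Copy(samples, padded, samples.Length);
        return padded;
    }
}
```

With this sketch a 1600-sample buffer would be padded to 2048 elements, the extra 448 being zeros.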
For real-valued input such as audio, the second half of these complex numbers corresponds to negative frequencies: it contains the complex conjugates of the first half (the positive frequencies) and does not carry any new information.
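The symmetry can be checked with a small sketch. It uses a naive O(N^2) DFT rather than the FFT from DSP.cs, purely so that the example is self-contained; the class and method names are hypothetical. For a real-valued signal of N samples, bin N-k should be the complex conjugate of bin k.

```csharp
using System;

public static class DftSketch
{
    // Naive DFT of a real-valued signal; slow, for illustration only.
    public static void Dft(double[] x, double[] re, double[] im)
    {
        int n = x.Length;
        for (int k = 0; k < n; k++)
        {
            double sumRe = 0, sumIm = 0;
            for (int t = 0; t < n; t++)
            {
                double angle = -2.0 * Math.PI * k * t / n;
                sumRe += x[t] * Math.Cos(angle);
                sumIm += x[t] * Math.Sin(angle);
            }
            re[k] = sumRe;
            im[k] = sumIm;
        }
    }

    // True when bin N-k is the complex conjugate of bin k for every k in 1..N-1.
    public static bool IsConjugateSymmetric(double[] re, double[] im, double tolerance)
    {
        int n = re.Length;
        for (int k = 1; k < n; k++)
        {
            if (Math.Abs(re[k] - re[n - k]) > tolerance) return false;
            if (Math.Abs(im[k] + im[n - k]) > tolerance) return false;
        }
        return true;
    }
}
```

This is why a filter that changes bin k should change bin N-k in the same way: otherwise the inverse transform no longer yields a purely real signal.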
How Noise Reduction Works
First we use Fourier analysis to find the spectrum of pure tones that make up the background noise. For each windowed sample of the background audio, we take a Fast Fourier Transform (FFT) and then statistics are tabulated for each frequency band - specifically the maximum level achieved by at least n sampling windows in a row, for various values of n. This spectrum is referred to as the "fingerprint" of the static background noise in the environment.
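A simplified sketch of building the fingerprint follows. It assumes the simplest possible statistic, the maximum magnitude seen in each frequency bin across all windows, instead of the "n windows in a row" statistic described above. The class and method names are hypothetical; the real and imaginary arrays are assumed to come from a forward FFT of one window.

```csharp
using System;

public static class FingerprintSketch
{
    // Updates the fingerprint in place with the per-bin maximum magnitude of one window.
    public static void Accumulate(double[] fingerprint, double[] re, double[] im)
    {
        for (int k = 0; k < fingerprint.Length; k++)
        {
            double magnitude = Math.Sqrt(re[k] * re[k] + im[k] * im[k]);
            if (magnitude > fingerprint[k])
            {
                fingerprint[k] = magnitude;
            }
        }
    }
}
```

Calling Accumulate once per window of background audio leaves the loudest level observed for each bin in the fingerprint array.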
When recording, we then take the frequency spectrum of each short sample of audio and compare it to our fingerprint. Pure tones in the sample that aren't sufficiently louder than the fingerprint are probably noise, so we reduce their values in the spectrum (this general technique is called spectral noise gating).
We then compute the inverse FFT of the sample's "noise-reduced" spectrum to convert it back to the time domain, and play the audio. The result is the original sound, but with the frequencies associated with the background noise much reduced.
Similar techniques are used in high-end noise-reduction headphones; the main difference is that these will often dynamically calculate the noise in "real time" using a second microphone.
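The gating step described in this section can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the threshold multiplier and the attenuation factor are hypothetical parameters, as are the class and method names, and the spectrum is assumed to be held as separate real and imaginary arrays.

```csharp
using System;

public static class NoiseGateSketch
{
    // Scales every bin whose magnitude does not exceed threshold * fingerprint[k]
    // by 'attenuation' (0 removes the bin, 1 leaves it unchanged).
    // Returns the number of bins that were attenuated.
    public static int Gate(double[] re, double[] im, double[] fingerprint,
                           double threshold, double attenuation)
    {
        int gated = 0;
        for (int k = 0; k < re.Length; k++)
        {
            double magnitude = Math.Sqrt(re[k] * re[k] + im[k] * im[k]);
            if (magnitude <= threshold * fingerprint[k])
            {
                re[k] *= attenuation;
                im[k] *= attenuation;
                gated++;
            }
        }
        return gated;
    }
}
```

A threshold above 1 requires a tone to be clearly louder than the noise before it is kept, while an attenuation above 0 reduces the noise without removing it completely.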
Working with FFT in Windows Phone
- Download Dsp.zip and extract DSP.cs from the zip, then add it to your project. DSP.cs provides a namespace called DSP and a class FourierTransform containing a set of functions to compute the FFT.
- Refer to How to access and manage the Microphone raw data in WP7 for full instructions on how to manage the microphone on WP7.
- Here is a very good Microphone code sample, the same one used in this article.
Don't forget to include the DSP namespace:
using DSP;
Compute()
Compute() is the function in the namespace that computes the FFT. Here is its signature:
void Compute(UInt32 NumSamples, Double[] pRealIn, Double[] pImagIn, Double[] pRealOut, Double[] pImagOut, Boolean bInverseTransform);
- NumSamples: number of samples (must be a power of two)
- pRealIn: real part of the input (the raw data samples for a forward transform)
- pImagIn: imaginary part of the input (optional, may be null); fill it in when calculating the inverse Fourier Transform
- pRealOut: real part of the output coefficients
- pImagOut: imaginary part of the output coefficients
- bInverseTransform: when true, compute the inverse FFT
Cutting the frequencies
First, create an array of doubles to store the noise fingerprint.
private double[] fingerprint;
We need a DispatcherTimer to manage the noise fingerprint detection. The detection time is set to 4 seconds, though you can set it to any value. It has been observed that longer durations (more than 10 seconds) don't improve the result; rather, other background noises may interfere and the resulting fingerprint may not be optimal.
DispatcherTimer dtFingerprint;
// Timer that ends the noise fingerprint detection after 4 seconds
dtFingerprint = new DispatcherTimer();
dtFingerprint.Interval = TimeSpan.FromMilliseconds(4000);
dtFingerprint.Tick += new EventHandler(stopFingerprintDetection);
private void stopFingerprintDetection(object sender, EventArgs e)
{
    dtFingerprint.Stop();
    // Stop and restart the microphone: leaving it running after the fingerprint
    // phase sometimes misaligned the audio recorded into the stream buffer.
    microphone.Stop();
    MessageBar.Text = "Noise fingerprint computed.";
    SetButtonStates(false, false, true);
    microphone.Start();
    UserHelp.Text = "Record";
    StatusImage.Source = microphoneImage;
}
DispatcherTimer is included in the System.Windows.Threading namespace.
In my .xaml file I added a CheckBox component to enable or disable noise reduction during recording.
<CheckBox Content="Noise Reduction" Name="cb_noise_reduction" ... />
When the record button is pressed we allocate the fingerprint array and start the timer for detection.
private void recordButton_Click(object sender, EventArgs e)
{
    // Get audio data in 100 ms chunks
    microphone.BufferDuration = TimeSpan.FromMilliseconds(100);
    // Allocate memory to hold the audio data
    buffer = new byte[microphone.GetSampleSizeInBytes(microphone.BufferDuration)];
    // Allocate the noise fingerprint, one entry per FFT bin
    fingerprint = new double[FFT.FourierTransform.NextPowerOfTwo((uint)microphone.GetSampleSizeInBytes(microphone.BufferDuration))];
    // Set the stream back to zero in case there is already something in it
    stream.SetLength(0);
    WriteWavHeader(stream, microphone.SampleRate); // To save in .WAV format
    if (cb_noise_reduction.IsChecked == true)
    {
        dtFingerprint.Start(); // Start the noise fingerprint detection
    }
    else
    {
        SetButtonStates(false, false, true);
        UserHelp.Text = "Record";
        StatusImage.Source = microphoneImage;
    }
    // Start recording
    microphone.Start();
}
When dtFingerprint times out, recording begins. Both the fingerprint detection and the noise-reduced recording are handled inside the Microphone.BufferReady event handler.
private double cutoff = 0; // Gain applied to noisy bins: 0 removes them, 1 leaves them unchanged
void microphone_BufferReady(object sender, EventArgs e)
{
    // Retrieve audio data
    microphone.GetData(buffer);
    // Convert the 16-bit samples to doubles; the rest of sampleBuffer stays zero
    int index = 0;
    double[] sampleBuffer = new double[FFT.FourierTransform.NextPowerOfTwo((uint)buffer.Length)];
    for (int i = 0; i < buffer.Length; i += 2)
    {
        sampleBuffer[index] = Convert.ToDouble(BitConverter.ToInt16(buffer, i));
        index++;
    }
    if (dtFingerprint.IsEnabled)
    {
        // Fingerprint phase: keep the maximum magnitude seen in each frequency bin
        MessageBar.Text = "Computing noise fingerprint";
        double[] xre = new double[sampleBuffer.Length]; // Real part
        double[] xim = new double[sampleBuffer.Length]; // Imaginary part
        FFT.FourierTransform.Compute((uint)sampleBuffer.Length, sampleBuffer, null, xre, xim, false);
        for (int i = 0; i < xre.Length; i++)
        {
            double spectrum = Math.Sqrt((xre[i] * xre[i]) + (xim[i] * xim[i])); // Magnitude
            if (spectrum > fingerprint[i])
            {
                fingerprint[i] = spectrum;
            }
        }
    }
    else
    {
        // Recording phase: attenuate the bins that are not louder than the fingerprint
        MessageBar.Text = "Recording....";
        double[] xre = new double[sampleBuffer.Length]; // Real part
        double[] xim = new double[sampleBuffer.Length]; // Imaginary part
        double[] ixre = new double[sampleBuffer.Length]; // Real part after the inverse transform
        double[] ixim = new double[sampleBuffer.Length]; // Imaginary part after the inverse transform
        byte[] output = new byte[buffer.Length];
        FFT.FourierTransform.Compute((uint)sampleBuffer.Length, sampleBuffer, null, xre, xim, false);
        for (int i = 0; i < xre.Length; i++)
        {
            double cMagnitude = Math.Sqrt((xre[i] * xre[i]) + (xim[i] * xim[i])); // Magnitude
            if (cMagnitude < fingerprint[i])
            {
                // The loop visits both halves of the spectrum, so a bin and its
                // conjugate (index Length - i) are each scaled by the same test.
                xre[i] *= cutoff;
                xim[i] *= cutoff;
            }
        }
        FFT.FourierTransform.Compute((uint)xre.Length, xre, xim, ixre, ixim, true);
        // Convert the real part back to 16-bit samples, low byte first
        index = 0;
        for (int i = 0; i < buffer.Length / 2; i++)
        {
            short tmp = (short)ixre[i];
            output[index] = (byte)(tmp & 255);
            output[index + 1] = (byte)((tmp >> 8) & 255);
            index += 2;
        }
        // Store the noise-reduced audio data in the stream
        stream.Write(output, 0, output.Length);
    }
}
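The byte handling in the handler above can be isolated into two small helpers. The sketch below assumes 16-bit little-endian mono samples, as the handler does, and differs from it in two labelled ways: values are rounded rather than truncated, and they are clamped to the 16-bit range before the cast so that an overshoot after filtering cannot wrap around. The class and method names are hypothetical.

```csharp
using System;

public static class PcmSketch
{
    // Converts 16-bit little-endian PCM bytes to one double per sample.
    public static double[] ToSamples(byte[] pcm)
    {
        double[] samples = new double[pcm.Length / 2];
        for (int i = 0; i < samples.Length; i++)
        {
            // Low byte first, then high byte; the cast restores the sign.
            samples[i] = unchecked((short)(pcm[2 * i] | (pcm[2 * i + 1] << 8)));
        }
        return samples;
    }

    // Converts samples back to 16-bit little-endian PCM, clamping out-of-range values.
    public static byte[] ToPcm(double[] samples)
    {
        byte[] pcm = new byte[samples.Length * 2];
        for (int i = 0; i < samples.Length; i++)
        {
            double clamped = Math.Max(short.MinValue, Math.Min(short.MaxValue, Math.Round(samples[i])));
            short value = (short)clamped;
            pcm[2 * i] = (byte)(value & 0xFF);
            pcm[2 * i + 1] = (byte)((value >> 8) & 0xFF);
        }
        return pcm;
    }
}
```

Converting a buffer with ToSamples and straight back with ToPcm reproduces the original bytes, which is a convenient sanity check before any filtering is inserted between the two calls.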
Downloads
- DSP.zip has the full example code.
Summary
This article has shown an approach for cutting off selected frequencies from an audio sample on Windows Phone. The theory is also valid for the Qt/Symbian and S40 platforms.
Galazzo - Improvements
Hi All,
I made improvements to the sample code so that it performs better; it now works perfectly.
The sample is available for download.
I'm available for further explanations.
Sebastianogalazzo 19:42, 11 September 2012 (EEST)
Hamishwillee - Awesome
Worth explaining what was changed so that people can understand what "makes a difference"? hamishwillee 05:16, 12 September 2012 (EEST)
Galazzo - Improvements
Hi All,
The improvements concern performance. I moved the array memory allocation from local to global scope, saving the memory management overhead. The Nokia Lumia 800 has impressive performance, but optimizing the code is always useful so that it performs well in all conditions.
An important fix concerns the microphone management. I noticed that after the fingerprint computation, with the microphone still active, there were sometimes misalignments when recording audio into the stream buffer. The solution is very simple now, but was hard to find (days...). After stopping the fingerprint timer (dtFingerprint.Stop()) I stop the microphone, compute the fingerprint (for example if you want to apply filters) and restart the microphone.
This could be useful for other projects.
The algorithm is unchanged as it works well, except that I introduced the cutoff of the amplitude parameters (the xre and xim arrays), as I got better results, and introduced the cutoff global variable so that this important parameter can be controlled from the UI if you want. I have used a cutoff value of zero, which means I remove 100% of the affected frequency, but you can set a value less than 1, e.g. 0.3, to cut the frequency down to 30%, or greater than 1, e.g. 2.0, to enhance the frequency.
In the next few days I hope to improve the article by providing more information on cutting frequencies, showing examples of how to apply filters to the fingerprint, and uploading audio samples. galazzo 10:52, 12 September 2012 (EEST)
Hamishwillee - Thanks!
Wow - thanks for the update. I suggest you add an in source comment explaining why you have turned the microphone on and off above.
Just gave this a subedit to improve the "How Noise Reduction Works" section. I think it's much simpler and less repetitive - please confirm you are happy with it. There are a number of other minor fixes too. hamishwillee 04:19, 13 September 2012 (EEST)
Galazzo - Subedit
Hi Hamish,
I think your subedit improved the section.
Thanks for your contribution, happy for that :-)
Sebastianogalazzo 14:08, 13 September 2012 (EEST)