AI-powered audio devices are transforming the engineering landscape. From analyzing speech patterns, tone, and cadence to detecting anomalies or enhancing user experiences, these devices are pushing the boundaries of what audio technology can achieve.
Imagine detecting depression or anxiety from voice recordings by analyzing speech patterns, tone, and cadence. Or imagine using AI-powered headsets to issue commands or dictate notes in noisy environments like construction sites or factories.
These are just a couple of examples showing how AI is transforming human interactions by deciphering audio. The possibilities are endless!
But one constant challenge remains: noise cancellation.
You might think, “Isn’t that a problem of the past? Don’t we already have advanced AI tools that cancel out noise at the press of a button?” Sure, for simple use cases, they work great. But when it comes to complex, real-world scenarios, like handling emergency calls in chaotic settings or transcribing conversations in loud public spaces, noise becomes a whole new beast.
Noise has always been the audio engineer’s worst nightmare. Now, AI is the one trying to make sense of it all, and the challenge has only grown. While today’s tools can deliver impressive results for straightforward tasks, complex use cases demand far more. They require thoughtful, sophisticated engineering solutions to overcome the limitations of existing technology. And that’s where innovation truly comes into play.
The challenge with noise cancellation
In the project we worked on, the AI had to transcribe casual conversations between individuals. To do that, it has to tell the individuals apart by identifying the different speakers in the environment. This information is then processed further with AI.
One of the biggest challenges in recording the conversations of specific individuals is that most microphones available on the market are omnidirectional. These mics pick up sound from all directions at the same level of intensity. This makes it difficult for AI to decipher casual conversations, since overlapping conversations may be coming from areas that are of no interest to us. Hence, we opt for a unidirectional mic, which is a combination of more than one mic.
For instance, we could achieve a cardioid pattern by combining an omni mic with a dipole mic (a dipole captures audio from the front and rear while attenuating sound coming from other angles). But even these unidirectional mics are not perfect when the desired and undesired sounds arrive from the same direction. They can also still pick up background noise and other sounds, just at a comparatively reduced intensity.
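To make the omni-plus-dipole idea concrete, here is a minimal sketch of the textbook idealization: an omni element modeled as a constant gain of 1, a dipole modeled as cos(theta), and an equal-weight mix of the two. The function names and the -60 dB floor are illustrative assumptions, and real capsules will deviate from these ideal curves.

```python
import math


def omni_gain(theta_rad: float) -> float:
    """Ideal omnidirectional element: equal sensitivity at every angle."""
    return 1.0


def dipole_gain(theta_rad: float) -> float:
    """Ideal dipole (figure-8) element: full pickup front and rear, null at the sides."""
    return math.cos(theta_rad)


def cardioid_gain(theta_rad: float) -> float:
    """Equal-weight mix of the two ideal elements: 0.5 * (1 + cos(theta))."""
    return 0.5 * (omni_gain(theta_rad) + dipole_gain(theta_rad))


def to_db(gain: float, floor_db: float = -60.0) -> float:
    """Convert a linear gain to decibels, clamped at floor_db for very small values."""
    magnitude = abs(gain)
    if magnitude <= 0.0:
        return floor_db
    return max(20.0 * math.log10(magnitude), floor_db)


if __name__ == "__main__":
    # Angles are measured from the front of the mic (0 degrees = on-axis).
    for degrees in (0, 45, 90, 135, 180):
        gain = cardioid_gain(math.radians(degrees))
        print(f"{degrees:3d} deg  gain={gain:.3f}  ({to_db(gain):.1f} dB)")
```

In this idealized model, sound arriving from the side (90 degrees) is only about 6 dB quieter than sound from the front, and only the direct rear is fully rejected. Off-axis noise is reduced, not removed.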
But aren't they unidirectional?
Newsflash: unidirectional mics don’t work in just one direction. They are designed to be most sensitive to sound from the front, while reducing the intensity of sound from other directions. So, don’t assume that a unidirectional mic will completely eliminate background noise.
So, how do we overcome these challenges?
We use a combination of techniques to achieve the desired output. First, we use multiple mics and combine their signals through beamforming to minimize ambient noise. This helps us capture audio from the front while reducing the intensity of sound from other directions.
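One common way to combine multiple mics is delay-and-sum beamforming: shift each channel so that sound from the target direction lines up, then average. The sketch below is an illustration under stated assumptions, not the project's implementation. The two-mic geometry, the integer sample delays, and the 8-sample test tone are all hypothetical, chosen so that the rear signal lands exactly half a period out of phase and cancels. Real speech is broadband, so real attenuation is only partial.

```python
import math


def tone(i: int, period: int = 8) -> float:
    """Test tone with a period of `period` samples."""
    return math.sin(2.0 * math.pi * i / period)


def capture(arrival_delays, n_samples: int = 64):
    """Simulate each mic hearing the same tone, delayed by its arrival time in samples."""
    return [[tone(i - d) for i in range(n_samples)] for d in arrival_delays]


def delay_and_sum(channels, steering_delays):
    """Advance each channel by its steering delay, then average across mics."""
    usable = min(len(ch) - d for ch, d in zip(channels, steering_delays))
    return [
        sum(ch[i + d] for ch, d in zip(channels, steering_delays)) / len(channels)
        for i in range(usable)
    ]


def rms(samples) -> float:
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# Hypothetical geometry: front sound reaches mic 1 two samples after mic 0;
# rear sound reaches mic 0 two samples after mic 1.
front = capture([0, 2])
rear = capture([2, 0])

# Steering delays that align the front arrival across both mics.
steer_front = [0, 2]

front_out = delay_and_sum(front, steer_front)
rear_out = delay_and_sum(rear, steer_front)

if __name__ == "__main__":
    print(f"front RMS after beamforming: {rms(front_out):.3f}")
    print(f"rear RMS after beamforming:  {rms(rear_out):.3e}")
```

The front signal adds coherently and keeps its level, while the rear signal adds out of phase. Changing `steer_front` points the beam in a different direction without moving the mics.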
But there's more
Even with beamforming, managing reverberation and echo remains a challenge when microphones are placed inside enclosures. This is where digital signal processing (DSP) comes into play. We use DSP techniques, such as low-pass or high-pass filters, to remove unwanted frequencies and reduce the impact of ambient noise. We must also carefully balance loudness normalization against amplification, so that the desired audio is preserved while unwanted sounds are minimized.
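As a rough illustration of the filtering and normalization steps, the sketch below applies a first-order high-pass filter and then scales the result to a target peak. Everything specific here is an assumption made for the example: the 16 kHz sample rate, the 300 Hz cutoff, the 0.9 peak target, and the synthetic 50 Hz hum plus 1 kHz tone standing in for real audio. A production design would likely use higher-order filters and a perceptual loudness measure rather than peak level.

```python
import math


def high_pass(samples, cutoff_hz: float, sample_rate_hz: float):
    """First-order (RC-style) high-pass filter: attenuates content below cutoff_hz."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out


def normalize_peak(samples, target_peak: float = 0.9):
    """Scale the signal so its largest absolute sample equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


def rms(samples) -> float:
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


SAMPLE_RATE_HZ = 16_000  # assumed sample rate for this example

# Synthetic input: strong 50 Hz hum plus a quieter 1 kHz tone (stand-in for speech).
hum = [math.sin(2.0 * math.pi * 50.0 * n / SAMPLE_RATE_HZ) for n in range(1600)]
voice_band = [0.2 * math.sin(2.0 * math.pi * 1000.0 * n / SAMPLE_RATE_HZ) for n in range(1600)]
mixed = [h + v for h, v in zip(hum, voice_band)]

# Filter first, then normalize, so the gain is not spent amplifying the hum.
filtered = high_pass(mixed, cutoff_hz=300.0, sample_rate_hz=SAMPLE_RATE_HZ)
cleaned = normalize_peak(filtered, target_peak=0.9)

if __name__ == "__main__":
    hum_only = high_pass(hum, cutoff_hz=300.0, sample_rate_hz=SAMPLE_RATE_HZ)
    print(f"hum RMS before: {rms(hum):.3f}, after: {rms(hum_only):.3f}")
    print(f"peak after normalization: {max(abs(s) for s in cleaned):.3f}")
```

The order matters for the balance described above: normalizing before filtering would set the gain from the hum, leaving the desired signal quieter than it needs to be.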
It takes a good deal of trial and error to find the right enclosure material, size, and air gap within the cavity to minimize reverberation and echo while maintaining audio quality. We rely on cost-effective, frugal prototypes to refine the design and find the optimal solution.
Noise cancellation is a complex challenge that demands a combination of techniques and meticulous attention to detail. By using multiple microphones to create a unidirectional pattern through beamforming, applying passive filtering, and leveraging digital signal processing to further refine the audio, we can isolate the desired audio and minimize unwanted noise, achieving the optimal output.
AUTHOR
Geethanjali R
Electrical Hardware Engineer, Srushty Global Solutions
As an accomplished Electrical Hardware Engineer, she focuses on the design and development of cutting-edge electronic systems. With a strong background in circuit design and embedded systems, Geetha plays a pivotal role in driving innovation and ensuring the reliability of our products. She has successfully led multiple projects that have enhanced product performance and efficiency. Her analytical mindset and attention to detail enable her to tackle complex engineering challenges effectively.