# Science and Math for Audio Humans – Applications of Phase

by Danny Maland

Buckle up, folks - I'm about to do some disclaiming!

Everything that I set before you should be read with the idea that “this is how I've come to understand it.” If somebody catches something that's flat-out wrong, or if you just think that an idea is debatable, please take the time to start a discussion via the comments.

Phase effects happen all the time to pro audio types, often without our intention. For instance, if I generate a sonic event via a loudspeaker in a room that is even a tiny bit reflective, I will experience phase effects. The sonic event traveling directly to me from the loudspeaker will arrive at my position at some time, and a reflection of that same sonic event will arrive at my position at some later time. Of course, "later" may be a matter of microseconds, but it's still later.

An unfortunate mentality that can sometimes take hold in audio people is the idea that phase effects are inherently undesirable. To the neophyte, the phrase "out of phase" seems like a bad thing. It sounds even worse when it's coupled with "destructive interference." The newly minted practitioner of sonic wizardry, when inclined to think this way, is robbed of basic understanding and a useful tool for their bag of tricks. Phase is an awfully handy thing – in fact, destructive interference as a result of phase can be a very handy thing. Why?

First, consider the fact that you, as a human, are very likely to own a model of the most sophisticated and swift real-time audio processing system created to date. That processing system is your brain, which has two pretty good (if still limited) audio capture and transduction devices attached via a high-speed network. Those ears of yours are constantly talking to your brain via the auditory nerve, and because you probably have two of them, some pretty neat differential analysis can be done.

Figure 1 is a picture to help you visualize the next paragraph. The human head and ears are (very) roughly approximated, with a sound source a short distance away from the head.

Because it has two transducers to work with, the brain can deduce quite a bit. The brain will notice the raw difference in SPL between the two ears, plus whatever diffraction and absorption effects are caused by having the head in the way. Just as important, however, is the arrival-time difference between the left and right ears. If we assume that the sound pressure waves involved are traveling at 1,125 feet per second (13,500 inches per second), then the trip to the nearest ear takes about 0.22 milliseconds, while travel to the far ear takes 0.62 ms. That 0.4 ms delay is half the cycle of a 1250 Hz wave. We don't hear comb filtering, though, because we don't simply sum the inputs at our two ears. Instead, the two inputs are compared, and the brain can use all of this "differential" information to figure out where a sound is coming from.

So, what about my claim that “destructive” phase relationships can be helpful? Well, let's start with a very basic microphone design. The mic is constructed so that the back of the diaphragm is sealed off in a soundproof can. This is what is termed a “pressure” microphone, because any sufficiently large sound pressure wave, from any angle, will cause a difference in pressure between the front and back of the diaphragm. The pressure wave is unable to act on the rear of the diaphragm in a significant way, and so it cannot cancel itself by, say, approaching the diaphragm from the side. Sound pressure waves approaching from the rear of the enclosure will ultimately move the diaphragm as well, as long as they can diffract around the soundproof can. The microphone is omnidirectional, and while this is helpful in some cases, it is often a major hindrance. A much more useful device for most pro audio folks is a directional transducer. To get a directional transducer, we have to make the device a “pressure gradient” microphone – that is, a mic that senses the difference in pressure between the front and rear of the diaphragm, as opposed to pressure only.

The simplest example of this is a ribbon mic. Ribbon mics that do not damp or soundproof one side of the ribbon in some way display a figure-eight response, because they are open to free air on both sides. Sound arriving “edge on” pushes on both sides of the ribbon equally, and with no instantaneous net displacement, the microphone produces no signal. However, we usually want something even more selective. We want to pick up a lot of signal up front, and as little from the back as possible. A ribbon mic might not be rugged enough for what we want to do, and so we have to go back to our “mic in a can” idea – except that now, we don't seal the can. Instead, we introduce rear phase delay ports.

If all we did was drill some holes in the can, we would still get a figure-eight pattern, but if we add acoustical materials and internal geometry, we can create an acoustical labyrinth. The whole point of the acoustical labyrinth is to delay a sound arriving at the back of the mic so that reaching the diaphragm takes the same amount of time as traveling around to the front of the diaphragm does. I realize that the previous sentence might be a bit sticky, so here's the key:

We create cancellation by, rather counter-intuitively, making sure that a wave at the rear of the diaphragm is producing the same pressure (say, “+1 pressure units”) as a wave at the front of the diaphragm. If the pressures were +1 and -1, the diaphragm would move; the negative pressure at the back would move the diaphragm in the same direction as the positive pressure at the front. If we instead create a situation where the pressures match, then we get no output from the microphone. There is no pressure gradient, as the air pressure behind the diaphragm is the same as the pressure in front.

Figure 2 might help to make this a bit more clear.
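The arrival-time arithmetic above is easy to check for yourself. Here's a quick sketch; the path lengths of roughly 3 and 8.4 inches are assumptions back-calculated from the arrival times in the text, not measurements of a real head:

```python
# Interaural arrival-time sketch, using the figures from the text:
# sound travels at about 1,125 ft/s, i.e. 13,500 inches per second.
SPEED_IN_PER_S = 13_500.0

def arrival_ms(distance_inches: float) -> float:
    """Milliseconds for sound to cover the given distance."""
    return distance_inches / SPEED_IN_PER_S * 1000.0

near_ms = arrival_ms(3.0)   # nearest ear, ~0.22 ms
far_ms = arrival_ms(8.4)    # far ear, ~0.62 ms
itd_ms = far_ms - near_ms   # interaural time difference, ~0.4 ms

# The frequency whose half cycle matches that 0.4 ms offset -- where a
# naive electrical summing of the two ears would put the first notch:
notch_hz = 1000.0 / (2.0 * itd_ms)
print(f"{itd_ms:.2f} ms -> first notch at {notch_hz:.0f} Hz")  # 0.40 ms -> 1250 Hz
```

Mix two spaced microphones to one channel and you'd hear that 1250 Hz notch (and its repeats) as comb filtering; the brain's comparison of the two ear signals is exactly what spares us from it.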

Of course, a very simple setup like the one pictured would cause a nightmarish world of comb filtering. Real microphones have much more sophisticated rear phase delay implementations. These real-life solutions trade complete cancellation at a few frequencies for good cancellation across a wide frequency range, not to mention an overall sound that avoids the strange hollowness of obvious comb filtering.
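Here's a rough numerical sketch of the pressure-gradient idea. The port spacing, internal delay, and test frequency are all illustrative assumptions (no real microphone is this simple), but the geometry shows why matching the internal delay to the external front-to-rear path kills sound from behind:

```python
import math

C = 343.0        # speed of sound, m/s
D = 0.02         # assumed front-to-rear port spacing, m
T = D / C        # internal labyrinth delay chosen to match the external path

def output(theta_deg: float, freq: float = 100.0) -> float:
    """Relative output for a plane wave arriving at angle theta
    (0 = on axis, 180 = from the rear). The diaphragm is driven by the
    difference between the front pressure and the internally delayed
    rear pressure."""
    w = 2.0 * math.pi * freq
    # Extra external travel from the front port to the rear port
    # shrinks (and finally reverses) as the wave swings around the mic:
    external = (D / C) * math.cos(math.radians(theta_deg))
    # Magnitude of the phasor difference: front minus delayed rear.
    phase = w * (external + T)
    return math.hypot(1.0 - math.cos(phase), math.sin(phase))

front = output(0.0)      # maximum drive
side = output(90.0)      # roughly half the front value
rear = output(180.0)     # external lead exactly cancels the delay: zero
```

Sweeping theta from 0 to 360 degrees traces the familiar cardioid: full output up front, about half at 90 degrees, and a null at the rear, exactly the "equal pressure on both sides" condition described above.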

Probably the most recently popular example of phase as both hurtful and helpful is the area of subwoofer deployment. Let's say that you have an area to cover where you could place stacks of speakers about 30 feet apart, or roughly 10 meters. (My acoustical prediction software only works in metric.) If you have four subwoofers, you could stack two each on either side of the deployment area, except...

Where the subs' outputs arrive at effectively the same time relative to the cycle times of the frequencies involved, you get a lot of level. When you're close enough to one stack or the other that its SPL at your position is much higher than that of the far stack, you're also in good shape. However, if you're somewhere the SPL of both stacks is similar, and one stack's sound arrives substantially later than the other's, you're not in good coverage.
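To put a number on that bad seat, here's a sketch with made-up geometry (the stack and listener positions are illustrative, not any real venue): two stacks 10 meters apart and a listener 8 meters out front, 2 meters off center.

```python
import math

C = 343.0                   # speed of sound, m/s
left_stack = (-5.0, 0.0)    # two sub stacks 10 m apart
right_stack = (5.0, 0.0)
listener = (2.0, 8.0)       # 8 m out front, 2 m off center

def distance(a, b):
    """Straight-line distance between two (x, y) points, in meters."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Difference in path length from each stack to the listener:
path_diff_m = abs(distance(left_stack, listener) - distance(right_stack, listener))
offset_ms = path_diff_m / C * 1000.0
# The first deep cancellation lands where that offset is half a cycle:
first_null_hz = C / (2.0 * path_diff_m)
print(f"offset ~{offset_ms:.1f} ms, first null near {first_null_hz:.0f} Hz")
```

For this geometry the stacks arrive about 6 ms apart, which parks the first deep cancellation near 82 Hz – squarely inside the subwoofer passband, and with both stacks at comparable SPL there's nothing to mask it.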

A quick fix is to get all those subs clustered in a central location so that no matter where you are, their combined SPL and relative arrival times are very close. The coverage is much more even, but a lot of acoustical energy is getting thrown behind the subs, which may not be what you want.

If you have the processing available, and the space required, you can extrapolate a little bit from rear phase delay ports to create an end-fired subwoofer array. Because of the relative simplicity of the array, it tends to work best at a narrow range of frequencies, but the problems with this are mitigated by subwoofers not usually being asked to reproduce a very large passband anyway. The trick is to use physical spacing and digital delay to create a situation where the subs cancel their outputs behind the array, but sum their outputs in front.

The key with delay is to remember that applying it has the effect of pushing the delayed device away from your frame of reference. If you're standing at the first subwoofer, looking down the line, delaying a box pushes it away from the first subwoofer in time. However, a person at the end of the line looking at the first subwoofer perceives that the delayed device is being pushed towards the first subwoofer instead. If we decide, for example, to use a target frequency of 60 Hz, then we can calculate that a quarter wave is about 1.43 meters, or 4.17 ms. I'm using a quarter wave because, if we start with two boxes, spaced a quarter wave apart, and then delay the second subwoofer one quarter wave, it appears to be a half wave away from the first box when standing behind the array, but appears to be precisely aligned with the first box when in front of the array.
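The quarter-wave figures above are quick to verify, assuming the ~343 m/s speed of sound that the text's metric numbers imply:

```python
C = 343.0      # speed of sound, m/s
FREQ = 60.0    # target frequency, Hz

wavelength_m = C / FREQ            # ~5.72 m
quarter_m = wavelength_m / 4.0     # ~1.43 m of physical spacing
quarter_ms = 1000.0 / FREQ / 4.0   # ~4.17 ms of delay

# Two-box check: box two sits a quarter wave downstream of box one and
# is delayed by a quarter wave. Out front, the physical head start and
# the electronic delay cancel; behind, they add up to a half wave,
# which is the condition for full cancellation.
front_offset_ms = quarter_ms - quarter_ms   # 0.0: perfectly aligned
rear_offset_ms = quarter_ms + quarter_ms    # ~8.33 ms, half the 60 Hz cycle
```

That asymmetry, zero net offset one way and a half cycle the other, is the whole trick of the end-fired array.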

With just two subs in the array, we've already lost a lot of rearward spill, while maintaining smooth coverage to the front. Now, if we add a third sub, we can put it two-quarters of a wave away from the first box, and then delay it by two-quarters of a wave. This does put the number three box in phase with the rearmost sub, but it also combines nicely with the number two sub. The result is a touch more spill directly to the rear, but a more focused area of that "back spill" and stronger output to the front.

Now, let's add our final sub. It's three-quarters of a wave away from box one, so three-quarters of a wave of delay causes it to appear a half wave out of phase when heard from the reference point of box one, behind the array, while still lining up exactly with the other boxes out front.

Again, adding another sub has allowed us more forward coverage, while tightening up our spill to the rear sides. The tradeoff is a longer “tail” directly behind the array.
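Summing the four boxes as phasors at the 60 Hz target shows the effect directly. This is a free-field, single-frequency, on-axis sketch, so it deliberately ignores the broadband and indoor complications mentioned elsewhere in the article:

```python
import cmath
import math

C = 343.0
FREQ = 60.0
QUARTER_M = C / FREQ / 4.0     # physical spacing between boxes, ~1.43 m
QUARTER_S = 1.0 / FREQ / 4.0   # delay step per box, ~4.17 ms

def array_level(direction: int, n_boxes: int = 4) -> float:
    """Normalized magnitude of the summed array output on axis.
    direction = +1 listens in front of the array, -1 behind it."""
    w = 2.0 * math.pi * FREQ
    total = 0j
    for k in range(n_boxes):
        # Box k is k quarter waves downstream and delayed k quarter waves.
        # Toward the front its travel time shrinks by k * QUARTER_S;
        # toward the rear it grows by the same amount.
        net_offset = k * QUARTER_S - direction * k * QUARTER_S
        total += cmath.exp(-1j * w * net_offset)
    return abs(total) / n_boxes

front = array_level(+1)   # every box lands in phase: 1.0
rear = array_level(-1)    # offsets of 0, 1/2, 1, 3/2 cycles: cancellation
```

At exactly 60 Hz the four boxes sum fully out front and cancel almost completely behind, and the three-box case described above falls out of the same function with n_boxes=3, leaving the "touch more spill" the text predicts.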

Although deploying an array like this takes up a lot of room (and power), and may not result in exactly the predicted performance (after all, doing this indoors is going to introduce a lot of factors not accounted for in this simple prediction), it's a great example of using phase as a tool to solve problems.