Timeless Interview with Bob Ludwig of Gateway Mastering
Bob Ludwig is the best mastering engineer working today, and one of the true living legends of the music business. In addition to being a Grammy winning engineer, he has received many TEC Awards for excellence and was the first winner of the Les Paul Award from the Mix Foundation for setting the highest standards of excellence in the creative application of recording technology. We are very lucky and grateful to get some of his time to help us better understand contemporary recording issues.
Bob is a classical musician by training, having obtained his Bachelor's and Master's degrees from the Eastman School of Music, where he was also involved in the sound department and played Principal Trumpet with the Utica Symphony Orchestra.
Inspired by Phil Ramone when Phil taught a summer recording workshop at Eastman, Bob ended up working with Phil at the legendary A&R Recording Studios in New York. Together, they did sessions on projects with The Band, Peter, Paul & Mary, Neil Diamond, and Frank Sinatra.
After a few years at A&R Recording, Bob moved to Sterling Sound mastering studios, where he became a Vice-President. After seven years at Sterling, he moved to Masterdisk as Chief Engineer. In 1993, Bob and his wife Gail built Gateway Mastering in Portland, Maine, a state-of-the-art record-mastering facility that is the benchmark for quality mastering in recorded music. Bob and his Grammy-winning protégé Adam Ayan master records by the top artists every year, and Gail manages the studio.
Chris Castle interviewed Bob for the May 2010 issue of MusicTechPolicy Monthly.
Chris Castle: Is there a top 3 "don'ts" that you have to fix in mastering?
Bob Ludwig:
1. The most common big criticism I have is not paying enough attention to the vocal. The vocal is everything to the success of a song. Make it loud enough to be able to hear the lyrics. The problem is, if the vocal level is too high, all the energy of the track disappears; if it is too low, you can't understand what is being said. If you want to be able to hear every word and you are mixing it, be sure to have a friend who does NOT know the words come in and tell you what is being sung. Once you know the lyrics, you can mix them very low and still understand them, but everyone else might miss some important words. It is hard, but crucial, to get the right level.
Always cover yourself by doing one or two extra mixes, one with the vocal raised +0.5 dB and another with it raised +1 dB. Some languages need extra vocal level, as more nuances of the language can easily get lost. Louder vocals are usually found on country music mixes and on French and Japanese mixes.
2. Vocal sibilance that is not contained is a problem. As in item 1, some producers will make the vocal as bright as musically possible in order to have it be intelligible yet tucked into the track. Sometimes the vocal is simply too sibilant. These days, when most big projects are being cut for vinyl, it is even more important to control sibilance, as it creates high-amplitude, high-frequency grooves that are beyond the ability of all but the best cartridges to reproduce, and one gets a "spitting" sound on the sibilance. Controlling sibilance in the mix is by far the best place to do it, as the de-esser will only affect the voice, while de-essing during mastering necessitates compromising the brightness of the entire track.
3. A mix with a bright vocal and a dull drum sound is really a problem. The all-important snare takes up a lot of spectrum, and trying to brighten it with EQ will make the bright vocal even brighter and quickly become unacceptable. It is a real trap that can only be helped by mastering from the TV track with a separate a cappella vocal track, something that most often is not an option.
Chris Castle: Given the prevalence of listening to music digitally, do you find that producers are mixing for earbuds? Is it common to find an "iPod mix"?
Bob Ludwig: No, it isn't. Dr. Floyd Toole worked for Harman International (JBL speakers) and showed that averaging all the different consumer speakers (some bright, some with too much bass or midrange, etc.) one ends up with a very flat curve, which is empirical proof that mastering with an extremely accurate and flat playback system yields a product that sounds correct on more systems. Like speakers, earbuds run the gamut from the old stock Apple earbuds that sounded tinny and lacking in warmth, to top-of-the-line Shure earbuds that are extremely accurate, to "hip-hop" earbuds that are overly bass-heavy. One must master to sound as good as possible on all systems.
Almost all pop mixes are mixed with the bass and kick drum panned to the center, which is proper, as many people will be listening on boom boxes that have limited power, and having a powerful center-channel bass available to both speakers is ideal. Very early recordings of the Rolling Stones and The Beatles (to name two groups) were totally intended for mono and were recorded on 2-channel or 3-channel tape decks solely for creating a mono-only mix. When stereo became popular, these early multi-track tapes were repurposed for stereo, and the bass and kick drum were typically locked into either the right or left channel.
With earbuds and headphones this sounds very unnatural, and sometimes the decision is made to filter the low bass into the center by partially mono-ing the signal. This sounds much better. This is definitely a decision based on the current widespread use of earbuds, and it remains an important philosophical question when doing re-issues of old recordings with this problem.
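To make the idea of "mono-ing" the low bass concrete, here is a minimal sketch in Python. It is not Bob's actual process or any particular mastering tool; the function names, the 120 Hz cutoff, and the very gentle one-pole filter are all assumptions chosen for illustration. It splits each channel into a low band and a remainder, then moves the low band toward the center by a chosen amount.

```python
import math

def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """Gentle 6 dB/octave low-pass; an illustrative stand-in for a real crossover."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    state = 0.0
    out = []
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out

def mono_low_bass(left, right, cutoff_hz=120.0, sample_rate=44100, amount=1.0):
    """Blend the low band of both channels toward their average.

    amount=0.0 leaves the signal untouched; amount=1.0 makes the low band
    fully mono. The 120 Hz default is an assumption, not a recommendation.
    """
    low_l = one_pole_lowpass(left, cutoff_hz, sample_rate)
    low_r = one_pole_lowpass(right, cutoff_hz, sample_rate)
    out_l, out_r = [], []
    for l, r, ll, lr in zip(left, right, low_l, low_r):
        center = 0.5 * (ll + lr)             # mono version of the low band
        new_ll = ll + amount * (center - ll)  # slide left lows toward center
        new_lr = lr + amount * (center - lr)  # slide right lows toward center
        out_l.append((l - ll) + new_ll)       # remainder band is left alone
        out_r.append((r - lr) + new_lr)
    return out_l, out_r

# Example: a 50 Hz "bass" locked entirely in the left channel.
sr = 44100
bass_left = [math.sin(2 * math.pi * 50 * n / sr) for n in range(sr // 10)]
silent_right = [0.0] * len(bass_left)
out_l, out_r = mono_low_bass(bass_left, silent_right)
```

After processing, the right channel carries part of the bass that was previously only on the left, while the sum of the two channels is unchanged. A real tool would use steeper, phase-matched filters and would let the engineer choose how far to go, which is the "somewhat" in the answer above.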
Chris Castle: Can you explain how the "loudness" of a mix becomes a factor in mastering? Can you explain compression and how it affects you at the mastering stage?
Bob Ludwig: Compression uses a piece of hardware or a software plug-in that either enhances or, most often, limits the dynamic range of the music being fed into it. Compression is crucial to pop music. Live pop music is almost always performed at hearing-damaging levels, way above the 85 dB SPL OSHA threshold for the start of possible hearing loss. In order for this immense power to be even somewhat realistically reproduced on consumer systems, the pop sound pipeline must be compressed so that musically the performance has the extra energy that the live performance had. For pop music, this translates as a very musical thing.
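As a rough illustration of what "limiting the dynamic range" means, here is a minimal Python sketch of a static downward compressor curve. The function names, the -20 dB threshold, and the 4:1 ratio are assumptions for the example, and the sketch has no attack or release smoothing, so it is far simpler than any real compressor.

```python
import math

def compress_level_db(level_db, threshold_db=-20.0, ratio=4.0):
    """Static compressor curve: levels above the threshold rise 1/ratio as fast."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio

def compress_sample(x, threshold_db=-20.0, ratio=4.0):
    """Apply the static curve to one sample (levels in dB relative to full scale)."""
    if x == 0.0:
        return 0.0
    level_db = 20.0 * math.log10(abs(x))
    out_db = compress_level_db(level_db, threshold_db, ratio)
    gain = 10.0 ** ((out_db - level_db) / 20.0)
    return x * gain

# A passage that swings between -30 dB and 0 dB has 30 dB of dynamic range.
quiet_out = compress_level_db(-30.0)  # below threshold, unchanged
loud_out = compress_level_db(0.0)     # 20 dB over threshold becomes 5 dB over
print(quiet_out, loud_out, loud_out - quiet_out)
```

With these assumed settings the 30 dB swing shrinks to 15 dB. Make-up gain can then raise the whole program, which is how compression ends up being used to make a record louder overall.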
The loudness problem starts from the fact that when human beings hear two examples of the exact same musical program, with one turned up by only +0.5 or 1 dB, almost all listeners who don't know exactly what they are hearing choose the louder one as "sounding best". Fair enough.
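For a sense of how small those level differences are, the standard decibel formulas can be sketched in a few lines of Python. The function names here are just illustrative.

```python
def db_to_amplitude_ratio(db):
    """Convert a level change in dB to a signal amplitude (voltage) ratio."""
    return 10.0 ** (db / 20.0)

def db_to_power_ratio(db):
    """Convert a level change in dB to a power ratio."""
    return 10.0 ** (db / 10.0)

for step_db in (0.5, 1.0):
    amp = db_to_amplitude_ratio(step_db)
    power = db_to_power_ratio(step_db)
    print(f"+{step_db} dB -> amplitude x{amp:.3f}, power x{power:.3f}")
```

A +0.5 dB change is roughly a 6% increase in signal amplitude and +1 dB is roughly 12%, which is why listeners tend to register the change as "better" rather than as "louder".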
So through the years, the louder example is eclipsed by a yet louder example winning the hearts and minds of the artist, the engineer and the A&R person. At some point, the music is so loud and unnaturally compressed that the aural assault on the ear, while very impressively loud, has sucked the life out of the music and makes the listener subconsciously not want to hear the music again.
At a recent AES workshop on loudness that I took part in, Susan Rogers from Berklee College talked about the hair cells in our ears that receive music. She pointed out that loud compressed music does not "change" as much as dynamic music and noted that "We habituate to a stimulus if it stops changing. Change 'wakes up' certain cells that have stopped firing. This is cognitively efficient and therefore automatic." In other words, there are very physical reasons why too much compression turns off our music receptors.
Every playback system ever manufactured comes with a playback level control. If one is listening to an album, one should be able to turn that control anywhere one wants, and the absolute level on the CD should not make a difference. Another place where the level on a CD does NOT make the difference one would think is radio broadcast. It can be shown that, in general, loud CDs sound worse and less powerful on commercial FM radio than a CD with a moderate level that lets the radio station compressors handle the loudness problem. Non-classical radio station compressors make soft things loud and loud things soft.
Two areas where producers get upset about not having enough level are the iTunes Shuffle, or even comparing songs on the iTunes software itself, and that moment at the radio station where the PD is going through the week's new releases and deciding which 2 or 3 songs will be added to his playlist. Here, sometimes having a little extra level can make a lesser song seem a little more impressive, at least at first listen.
A great example of a contemporary recording that has full dynamic range is the Guns N' Roses "Chinese Democracy" CD, where Axl Rose wanted all the textures of the original mixes to come through, and he got his wish! A good example of one of the loudest, most distorted CDs is the Metallica "Death Magnetic" CD, where apparently 10,000 fans signed a web petition to have the album remixed because they got to hear how good it sounded on "Guitar Hero", which did not have all the digital limiters the final CD mix had.
Chris Castle: When recordings are made available digitally, they are often ripped from a CD and uploaded to iTunes or other services using codec (encoding/decoding) software. What does this do to sound quality?
Bob Ludwig: It depends. AAC (iTunes) files sound better than MP3 files at similar bit rates. A 128 kbps CD import, when done to MP3, has a watery interpolation sound that is usually pretty easily heard, while the AAC file usually does not have this problem. Only when comparing an unfamiliar AAC file to the CD or master does the lack of resolution become apparent.
The algorithm is difficult to predict, as a long-term average Fast Fourier Transform (FFT) shows near-identical frequency response of the 128 kbps file with the original (except for the most extreme high frequencies). Differences are found in impulse response, changes of width ("soundstage"), reverbs, etc. Like all codecs, the AAC published standard is a playback-only specification. So the playback is fixed, but scientists are free to try to improve the record side as much as possible, and indeed after speaking with one of Apple's codec people I confirmed that the AAC encoder in the iTunes software is at least 10% better than it was several years ago.
I re-ripped a recording I had done 6 years ago and I was delighted with the improvement. When iTunes decided to double the standard encode to 256 kbps it significantly raised the sound quality bar. You will notice when Apple did this there were no ad campaigns that trumpeted "The new iTunes software... now with half the old capacity!!"
Believe me, the original decision to go with 128 kbps was the right decision to launch the new technology, as that one decision had such an impact on having decent quality but with twice the storage capacity and, especially on the original hard drive iPods, twice the battery life one could expect from the current encode rates. It got the format off the ground. Now, with solid-state Flash drives instead of mechanically spinning hard drives, one can still have decent battery life and use the AAC "Apple Lossless" encoder, which indeed is bit-for-bit transparent to the original CD, with zero loss of file quality.
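The storage trade-off between 128 kbps and 256 kbps can be checked with simple arithmetic. The sketch below is in Python and makes several assumptions: a 4-minute song, decimal megabytes and gigabytes, "kbps" meaning 1,000 bits per second, and no allowance for metadata or container overhead. The function names are illustrative only.

```python
# Uncompressed CD audio: 44,100 samples/s x 16 bits x 2 channels.
CD_BITRATE_BPS = 44_100 * 16 * 2  # 1,411,200 bits per second

def file_size_megabytes(bitrate_kbps, seconds):
    """Approximate size of a constant-bit-rate file, in decimal megabytes."""
    return bitrate_kbps * 1000 * seconds / 8 / 1_000_000

def songs_per_gigabyte(bitrate_kbps, song_seconds=240):
    """Whole songs that fit in one decimal gigabyte (assumed 4-minute songs)."""
    return int(1_000 / file_size_megabytes(bitrate_kbps, song_seconds))

for rate in (128, 256):
    size = file_size_megabytes(rate, 240)
    count = songs_per_gigabyte(rate)
    print(f"{rate} kbps: {size:.2f} MB per song, about {count} songs per GB")

print(f"CD audio is about {CD_BITRATE_BPS / 128_000:.1f} times the 128 kbps rate")
```

Under these assumptions a 4-minute song takes about 3.84 MB at 128 kbps and 7.68 MB at 256 kbps, so doubling the bit rate halves the number of songs that fit, which is the "half the old capacity" point made above.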