Digital Audio Basics #1: What You Need to Know

Nov 09, 2021

Understanding how digital audio works including sample rate, bit depth and A-D conversion can ensure a smooth recording experience. Find out what you need to know!

By Craig Anderton


This is the first piece in a two-part series on digital audio basics. Read part #2 here.

Digital audio has made it possible to create high-quality music productions with budget computers and software. Amazingly, today’s bedroom studio is sonically equivalent to the quarter-million-dollar studios of yesteryear (and sometimes even better).

But along with improved technology, digital audio has introduced new concepts and terms—and it’s important to understand them to get the best results from your recordings. In Part 1 of this two-part series, we’ll cover how digital audio works. In Part 2, we’ll cover best practices, so you can get the most out of digital recording.

1. Audio by the Numbers

All audio—even a symphony orchestra—is a single sound wave with a varying level. A vinyl record’s groove captures that wave’s shape, and a turntable’s needle vibrates as it follows the wave. Amplifying those vibrations produces sound by reproducing the original waveform.

But analog audio has limitations. Records wear out, dust and scratches interfere with sound quality, and turntable needles and vinyl records have physical limitations that make it difficult to reproduce sound accurately.

Digital audio solves those problems by using a device called an analog-to-digital converter to convert audio into numbers while recording. On playback, a digital-to-analog converter converts the numbers back into audio. Numbers aren’t subject to the same kind of physical deterioration as an analog waveform. Audio that’s expressed as numbers can be copied, transferred, modified and even cloned without degradation. Fig. 1 shows the process that converts audio into numbers and then back into audio.

Figure 1: How audio turns into numbers and back into audio again.

1A is the original analog waveform. In 1B, the analog-to-digital converter takes a “snapshot” of the signal’s level many thousands of times per second, at intervals measured in microseconds (millionths of a second). The number of measurements the converter takes each second is the sample rate, also called sampling frequency. In 1C, the computer stores this series of snapshots, or samples, as numbers that represent the voltage level at each snapshot, and therefore the signal’s level variations. The audio has now been frozen into a series of numbers. Even better, these numbers can be manipulated mathematically in a digital audio editing program to change levels, add special effects like delay, and so on.

We can’t listen to numbers, so in 1D, the digital-to-analog converter converts the digital data back to a series of voltage levels. However, there are extremely tiny steps when the levels change, so a final smoothing filter rounds off the steps to restore the waveform’s original analog shape.
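To make the steps in Fig. 1 more concrete, here is a minimal Python sketch of the idea, not how any particular converter is built. It assumes a 1 kHz sine wave as the "analog" input, the CD sample rate of 44.1 kHz and 16-bit samples. The function names are made up for this illustration, and the final smoothing filter is left out.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, bits, n_samples):
    """Take snapshots of a full-scale sine wave and store each one as a whole number."""
    max_code = 2 ** (bits - 1) - 1  # largest positive value, 32767 for 16 bits
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                       # time of this snapshot, in seconds
        level = math.sin(2 * math.pi * freq_hz * t)  # "analog" level, from -1.0 to 1.0
        codes.append(round(level * max_code))        # the number that gets stored
    return codes

def to_levels(codes, bits):
    """Turn stored numbers back into levels, as a digital-to-analog converter would."""
    max_code = 2 ** (bits - 1) - 1
    return [code / max_code for code in codes]

# Time between snapshots at the CD sample rate, in microseconds (about 22.7)
sample_period_us = 1_000_000 / 44_100

codes = sample_and_quantize(1_000, 44_100, 16, 8)
levels = to_levels(codes, 16)
print(f"{sample_period_us:.1f} microseconds between samples")
print(codes)
```

The list of integers is the "frozen" audio. Changing the level, for example, would just mean multiplying every number by the same factor.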

Although this might seem complex compared to analog audio, it has several advantages. Unlike recording to tape, numbers recorded into a computer don’t deteriorate. With digital recording, when it’s time to mix down, you mix down numbers to stereo or surround, which creates another set of numbers. So, the stereo mix will represent exactly what you heard when you mixed the song. Then you can stream that digitally mixed string of numbers over the internet, copy it over to a smartphone’s memory, press the bits into a Compact Disc, and so on. A transfer isn’t just a copy of the original sound but a clone.

2. Digital Audio Hardware

Converting between analog and digital requires a device called an audio interface. Sound sources like microphones and guitars produce analog audio, and after being digitized, we won’t hear the audio until it becomes analog again.

Audio interfaces incorporate the converters that translate analog to digital and back to analog again. Fortunately, digital audio manufacturing processes have become so good that even low-cost audio interfaces achieve a level of quality that would have been unthinkable in the early days of digital audio. More expensive interfaces will have features like higher-quality mic preamps and premium analog-to-digital and digital-to-analog converters. But there’s a point of diminishing returns. A $1,000 audio interface will likely sound better than a $100 interface, but it almost certainly won’t sound ten times better.

3. Recording Resolution

Sampling audio and measuring its levels is a start, but those measurements need to be as accurate as possible. Bit resolution specifies the accuracy with which an analog-to-digital converter measures an input signal.

A good analogy is a ruler’s calibrations. A ruler calibrated in inches can measure only inches with certainty, but a ruler calibrated in sixteenths of an inch can measure the length with 16 times better resolution. A ruler calibrated in thirty-seconds of an inch is even more accurate.

With digital audio, each sample measures a signal’s level at that instant. The more precise the measurement, the more accurate the conversion from analog audio into digital data, and the better the resolution. Resolution depends on the number of bits, which have a basis in binary math. It’s not necessary to know how any of this works to record music, but let’s revisit the concept of measuring with a ruler.

Think of bit resolution as specifying the calibrations for measuring digital audio signals—more bits would be like having more calibration marks on a ruler. 4 bits can measure 16 different values, so it would be like measuring sixteenths of an inch. A CD’s 16-bit resolution can measure 65,536 values, and 24 bits can measure 16,777,216 values. In theory, 24 bits can measure levels with 256 times greater precision than 16 bits.
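The figures above follow from simple binary arithmetic: each added bit doubles the number of values that can be represented. A short Python sketch (the function name is just for illustration) reproduces them:

```python
def quantization_levels(bits):
    """Number of distinct values that a given number of bits can represent."""
    return 2 ** bits

for bits in (4, 16, 24):
    print(f"{bits} bits: {quantization_levels(bits):,} values")

# How many 24-bit steps fit inside a single 16-bit step
ratio = quantization_levels(24) // quantization_levels(16)
print(f"24 bits vs. 16 bits: {ratio} times more values")
```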

Resolution varies among audio systems. Higher resolution requires more memory to store larger numbers, as well as accurate analog-to-digital conversion to take advantage of the higher resolution. Because memory and high-quality converters have become less expensive, gear, in general, has gravitated toward higher bit resolutions.

For example, an audio greeting card may have audio with only 4 bits of resolution. Early digital audio systems used 8 bits. 12-bit samplers were common, and 12 bits was considered the minimum acceptable resolution for working with digital audio. CDs use 16-bit resolution, and high-resolution audio uses 24-bit resolution. The audio examples show the differences between different resolutions.

Although a 24-bit file requires 50% more storage than a 16-bit file (assuming both are at the same sample rate), recording engineers prefer 24-bit recording to 16-bit recording. With the ever-declining cost of memory, home studios can afford to work with 24-bit resolution—the days of a 1 GB hard disk costing $2,000 (yes, that really was true!) are way behind us.
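The 50% figure can be checked with a rough calculation. This sketch assumes uncompressed audio and ignores the small header that real file formats add. The function name and the one-minute stereo example are assumptions chosen for illustration.

```python
def pcm_bytes(seconds, sample_rate_hz, bits, channels):
    """Approximate storage for uncompressed audio, ignoring any file header."""
    return seconds * sample_rate_hz * channels * bits // 8

# One minute of stereo audio at the CD sample rate
size_16 = pcm_bytes(60, 44_100, 16, 2)
size_24 = pcm_bytes(60, 44_100, 24, 2)

print(f"16-bit: {size_16:,} bytes")
print(f"24-bit: {size_24:,} bytes")
print(f"Extra storage: {(size_24 / size_16 - 1) * 100:.0f}%")
```

Under these assumptions, one minute of stereo takes 10,584,000 bytes at 16 bits and 15,876,000 bytes at 24 bits.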

4. Distortion and Bit Resolution

If you can’t measure a signal accurately, then you can’t reproduce it accurately—so low bit resolutions can create distortion. However, unlike distortion in the physical world (which tends to increase with higher signal levels), digital distortion increases with lower signal levels because there are fewer bits available for measuring the level (fig. 2).


Figure 2: A fixed amount of resolution means that a high-amplitude signal (left) uses all the available precision. A low-amplitude signal (right) uses a smaller fraction of the available precision.
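One way to see the effect in Fig. 2 is to compare a signal with the rounding error introduced when it is stored as whole numbers. The sketch below is an illustration only. It assumes a sine-wave test tone, 16-bit storage with plain rounding (no dithering), and made-up names and test values. The result is a signal-to-error ratio in decibels, where a lower number means the error is larger relative to the signal.

```python
import math

def signal_to_error_db(amplitude, bits=16, sample_rate_hz=44_100,
                       freq_hz=997, n_samples=4_410):
    """Ratio of signal power to rounding-error power, in dB, for a sine wave."""
    max_code = 2 ** (bits - 1) - 1
    signal_power = 0.0
    error_power = 0.0
    for n in range(n_samples):
        phase = 2 * math.pi * freq_hz * n / sample_rate_hz
        exact = amplitude * math.sin(phase) * max_code  # ideal, unrounded value
        stored = round(exact)                           # what the file actually holds
        signal_power += exact ** 2
        error_power += (stored - exact) ** 2
    return 10 * math.log10(signal_power / error_power)

loud_db = signal_to_error_db(1.0)     # uses the full 16-bit range
quiet_db = signal_to_error_db(0.001)  # uses only a few dozen of the available steps

print(f"Full-level signal: {loud_db:.0f} dB")
print(f"Very quiet signal: {quiet_db:.0f} dB")
```

The rounding error stays about the same size in both cases, so it makes up a much larger share of the quiet signal.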

Fortunately, a technique called floating-point math can turn this into a non-issue by essentially expanding the resolution for lower-level audio. Also, a technique called dithering (described in Part 2) can reduce the perceived amount of distortion on playback. Most importantly, the audio inside your program isn’t bound by hardware’s rules and can have essentially unlimited resolution when it’s being processed inside the computer.
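As a rough illustration of why floating-point helps with low-level audio, the sketch below stores one very quiet sample value two ways: as a 16-bit whole number, and as a 32-bit floating-point number. The 32-bit float format is an assumption here, since the exact format depends on the software, and the sample value and function names are invented for the example.

```python
import struct

def relative_error_int16(level):
    """Error, relative to the sample itself, after storing it as a 16-bit integer."""
    max_code = 32767
    stored = round(level * max_code) / max_code
    return abs(stored - level) / abs(level)

def relative_error_float32(level):
    """Error, relative to the sample itself, after storing it as a 32-bit float."""
    stored = struct.unpack("<f", struct.pack("<f", level))[0]
    return abs(stored - level) / abs(level)

quiet_level = 0.000123456  # a very low-level sample, where full scale is 1.0

int_error = relative_error_int16(quiet_level)
float_error = relative_error_float32(quiet_level)

print(f"16-bit integer: {int_error:.2%} error")
print(f"32-bit float:   {float_error:.8%} error")
```

A fixed-point number has the same step size at every level, while a floating-point number scales its steps with the value, so quiet samples keep roughly the same relative precision as loud ones.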

5. Sampling Rate

If the system doesn’t take samples of the signal level at a fast enough rate, it’s harder to reproduce a signal accurately. Audio for CDs is sampled at 44.1 kHz, which means the system samples the audio 44,100 times per second. A digital system has to sample at more than twice the highest frequency it needs to reproduce, so 44.1 kHz is just enough to cover frequencies from 20 Hz to 20 kHz (the maximum range of human hearing).
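The sketch below illustrates the link between sample rate and frequency range, assuming the standard rule that a sampled system can only represent frequencies below half its sample rate. A tone above that limit doesn't disappear. Once sampled, it produces exactly the same numbers as a lower tone. The function names and the 30 kHz test tone are assumptions made for this example.

```python
import math

def nyquist_frequency(sample_rate_hz):
    """Upper limit of the frequency range that a sample rate can represent."""
    return sample_rate_hz / 2

def apparent_frequency(freq_hz, sample_rate_hz):
    """Frequency that a sampled tone appears to have once it is stored as samples."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

def sample_cosine(freq_hz, sample_rate_hz, n_samples):
    """Snapshots of a cosine wave taken at the given sample rate."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

rate = 44_100
print(nyquist_frequency(rate))           # limit for CD audio, in Hz
print(apparent_frequency(20_000, rate))  # below the limit, stored as itself
print(apparent_frequency(30_000, rate))  # above the limit, stored as a lower tone

# The samples of a 30 kHz tone match those of the lower tone it is mistaken for
too_high = sample_cosine(30_000, rate, 100)
mistaken_for = sample_cosine(apparent_frequency(30_000, rate), rate, 100)
```

At 44.1 kHz, the 30 kHz tone is indistinguishable from a 14.1 kHz tone, which is why the sample rate has to be high enough for the frequencies you want to keep.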

The tradeoff is that higher sample rates require more storage and may limit the number of tracks you can record for a given amount of computing power. Most recording enthusiasts use either 44.1 kHz or 48 kHz, two standard sampling rates, for their projects. If you have the necessary computing power and storage, recording at 96 kHz follows the current standard for high-level recording facilities. But in most cases, it’s not necessary. We’ll cover an exception in Part 2.

Some audio uses lower sampling rates, like 22.050 kHz or even 8 kHz. These lower sample rates are intended for applications like dictation. As you’ll hear in the following audio examples, 11 kHz isn’t as good as 44.1 kHz, and 6 kHz is totally unsuitable for music.

6. Digital Audio Quality

It all sounds perfect, right? Well…several factors can impact quality. Usually, tradeoffs relate to cost—for example, cheap analog-to-digital converters don’t matter in a toy that makes sound, but they matter when you’re recording music. Another tradeoff is file size. The numbers created by digital recording need to be stored in memory and processed by a computer. A data-compressed format like MP3 trades off file size for quality so that downloads from the internet take less time—but don’t sound quite as good as files that aren’t data compressed. In Part 2, we’ll cover best practices to take maximum advantage of digital recording.

Musician/author Craig Anderton is an internationally recognized authority on music and technology. He has played on, produced, or mastered over 20 major label recordings and hundreds of tracks, authored 45 books, toured extensively during the 60s, played Carnegie Hall, worked as a studio musician in the 70s, written over a thousand articles, lectured on technology and the arts (in 10 countries, 38 U.S. states, and three languages), and done sound design and consulting work for numerous music industry companies. He is the current President of the MIDI Association. www.craiganderton.org