Audio format guide: MP3, M4A, AAC, FLAC, and more
When the MP3 player took off in the late 1990s, the format itself entered the public consciousness in a way not many others have — with perhaps the Word document being an exception. But what is an audio format, anyway, and why should you care?
This guide will cover some of the most popular formats used by audio streaming services today and explain their differences.
What is an audio file format?
A digital audio file is how recorded content gets saved on a computer, media player, smartphone, or other device. Digital audio is, at its most basic level, a series of numbers that a device can use to recreate sound waves. There are various ways to capture those numbers and then compress (or not compress) the resulting data. During analog-to-digital conversion, sampling a sound wave at 44.1kHz with at least 16 bits per sample captures everything within the range of human hearing, so the signal can be faithfully reproduced later on. This is thanks to some math called the Nyquist-Shannon sampling theorem, which says the sample rate must be at least twice the highest frequency you want to capture. We can use higher bit depths and sample rates, but whether anyone can hear a difference, even through the best headphones, is debatable at best.
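As a rough illustration of the numbers behind 16-bit, 44.1kHz audio, the sketch below computes the highest frequency a sample rate can capture and the approximate dynamic range of a given bit depth. The function names are made up for this example, and the 20 × log10 rule is the usual textbook approximation for linear PCM rather than anything specific to a particular device.

```python
import math


def nyquist_frequency(sample_rate_hz):
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2


def quantization_levels(bit_depth):
    """Number of distinct values one sample can take."""
    return 2 ** bit_depth


def dynamic_range_db(bit_depth):
    """Approximate dynamic range of linear PCM: 20 * log10(number of levels)."""
    return 20 * math.log10(quantization_levels(bit_depth))


cd_nyquist = nyquist_frequency(44_100)  # 22050.0 Hz, just above the 20kHz hearing limit
cd_levels = quantization_levels(16)     # 65536 possible values per sample
cd_range = dynamic_range_db(16)         # roughly 96.3 dB
```

With these figures, a 44.1kHz sample rate covers slightly more than the commonly quoted 20kHz upper limit of human hearing.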
If we just save that data as is (known as pulse code modulation or PCM), the file takes a large amount of space. That’s why both lossy and lossless forms of audio compression have been developed. Lossy audio throws out audio frequencies that our ears cannot hear while lossless preserves them all. Lossy audio formats can also use other tricks to compress audio even further, which we’ll cover a little later.
Because most people these days access their music via streaming services, compressed, lossy file formats are the predominant way content gets distributed. That’s fine if you’re casually listening, but some people demand the utmost in quality. As a result, an increasing number of high-quality and even lossless streaming options are now available. But there’s no getting around the fact that lossy formats take up less space and eat up less mobile data, as the chart below makes clear.
|Stereo file sizes (16-bit, 44.1kHz)|WAV|AIFF|FLAC (typical)|MP3 (320kbps)|MP3 (192kbps)|
The MP3 audio file format once reigned supreme when it came to downloading music. In fact, the format is so synonymous with mobile music solutions that “MP3 player” is now generic for an audio playing device. However, these days it’s less prominent for a variety of reasons. It’s still hanging on, though. Understanding MP3 files can help us understand other formats more easily as well, so we’ll start here.
An MP3 file is a lossy audio file, meaning it discards data our ears cannot hear. Almost every human has a hearing range somewhere between 20Hz and 20kHz. The upper limit actually decreases with age, but in general that’s the range within which every noise you’ll ever hear lies. Because frequencies outside this range are therefore superfluous, MP3 discards them.
In order to further save some space, MP3 files use even more tricks. Audio engineers use noise shaping algorithms based on psychoacoustic effects of the human ear and brain to remove parts of music that we shouldn’t be able to hear. For example, the brain can’t differentiate between two frequencies that are right next to each other. Furthermore, the adult human ear struggles to identify the direction of high frequency sounds. It also begins to lose sensitivity above 16kHz. Plus, loud sounds can mask quieter ones. All these can be removed with little to no noticeable difference to the end listener.
Basically, MP3 files remove frequencies we cannot hear and frequencies we could hear in isolation, but cannot because of the way they're combined in a particular song.
An MP3 encoder splits a track into 576-sample frames, and Fast Fourier Transforms (FFT) are used to obtain frequency data from these frames. The frequency data is then analyzed to see if any opportunities exist to apply the compression rules based on human hearing as described above. If so, these portions are rounded down (quantized) to a lower precision, which takes fewer bits to store and helps save space. Data regarding how to restore each frame to its full sound wave representation gets saved to a 32-bit header.
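To make the frame-and-analyze idea concrete, here is a heavily simplified sketch. It cuts a signal into 576-sample frames, runs a naive Fourier transform on each, and keeps only the frequency bins that are reasonably strong compared with the loudest one. The masking rule and its `mask_ratio` parameter are assumptions invented for illustration. A real encoder uses a far more detailed psychoacoustic model, and all helper names here are hypothetical.

```python
import cmath
import math

FRAME_SIZE = 576  # frame length quoted in the text


def tone(bin_index, amplitude, n=FRAME_SIZE):
    """Sine wave that completes exactly `bin_index` cycles in one frame."""
    return [amplitude * math.sin(2 * math.pi * bin_index * t / n) for t in range(n)]


def split_into_frames(samples, n=FRAME_SIZE):
    """Cut a sample list into consecutive full frames (a partial tail is dropped)."""
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]


def magnitudes(frame):
    """Naive discrete Fourier transform: one magnitude per frequency bin."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(total))
    return spectrum


def audible_bins(frame, mask_ratio=0.01):
    """Toy masking rule (an assumption): keep bins at least
    `mask_ratio` times as strong as the loudest bin in the frame."""
    mags = magnitudes(frame)
    threshold = mask_ratio * max(mags)
    return [k for k, m in enumerate(mags) if m >= threshold]


# One loud tone plus a much quieter one, repeated over two frames.
one_frame = [a + b for a, b in zip(tone(10, 1.0), tone(40, 0.001))]
track = one_frame * 2

frames = split_into_frames(track)
kept = [audible_bins(frame) for frame in frames]  # the quiet tone at bin 40 is dropped
```

In this toy setup only the loud tone survives in each frame, which mirrors the idea that a quiet component hidden behind a much louder one can be thrown away.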
The bitrate determines the maximum allowed size of each frame. The more aggressive the compression, the more likely the algorithm is to remove something that is audible. Furthermore, this type of filtering and cutting isn’t perfect, and the quantization can leave behind artifacts that some people can hear. This lossy psychoacoustic compression is then followed by lossless Huffman coding, similar to the compression used in a .zip file, to save more space.
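The Huffman step mentioned above can be sketched in a few lines. The idea is to give frequent values short bit patterns and rare values longer ones, without losing any information. This is a generic textbook construction rather than the fixed code tables an actual MP3 encoder uses, and the sample values are invented for the example.

```python
import heapq
from collections import Counter


def huffman_codes(symbols):
    """Build a prefix-free bit code where frequent symbols get shorter codes."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (count, unique tiebreak, {symbol: code built so far}).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        count_a, _, left = heapq.heappop(heap)
        count_b, _, right = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in left.items()}
        merged.update({sym: "1" + code for sym, code in right.items()})
        heapq.heappush(heap, (count_a + count_b, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def encode(symbols, codes):
    return "".join(codes[sym] for sym in symbols)


def decode(bits, codes):
    lookup = {code: sym for sym, code in codes.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:
            decoded.append(lookup[current])
            current = ""
    return decoded


# Quantized values cluster around zero, so zero should get the shortest code.
quantized = [0] * 50 + [1] * 20 + [-1] * 20 + [2] * 5 + [-2] * 5
codes = huffman_codes(quantized)
bits = encode(quantized, codes)
restored = decode(bits, codes)
```

Here the 100 values fit in 190 bits, compared with the 300 bits a fixed 3-bit code for five distinct values would need, and decoding returns the original list exactly.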
If that sounds too complicated, the takeaway is that MP3 files remove frequencies we cannot hear and ones we theoretically could hear in isolation, but cannot in a particular song due to auditory masking. This can lead to quite small file sizes. If it is done too aggressively or with a bitrate that is too low, though, quality can suffer. As a result, MP3 isn’t all that popular anymore for streaming.
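For a sense of how bitrate translates into file size, the following sketch assumes a constant bitrate and ignores tags and other overhead. The three-minute track length is an arbitrary example, and the function name is hypothetical.

```python
def stream_size_bytes(bitrate_kbps, seconds):
    """Approximate size of constant-bitrate audio: bits per second * seconds / 8."""
    return bitrate_kbps * 1000 * seconds / 8


three_minutes = 180
size_320 = stream_size_bytes(320, three_minutes)  # 7,200,000 bytes, about 7.2 MB
size_192 = stream_size_bytes(192, three_minutes)  # 4,320,000 bytes, about 4.3 MB
```

The same arithmetic applies to mobile data use, since every byte of a stream has to be downloaded.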
AAC, M4A, and OGG Vorbis audio formats
Audio compression can take many forms, and other formats have been developed. These use slightly different algorithms and techniques to accomplish the task, so we can’t compare them based on bitrate alone.
OGG Vorbis is an open-source alternative to MP3. It still uses FFT and similar methods to analyze and quantize maskable frequency information but employs a different algorithm. Vorbis also takes the noise floor into account to improve low bitrate performance. Spotify uses this format at up to 320kbps.
There’s also AAC, which is used by Apple Music, Tidal, Pandora, and YouTube Music. It’s an evolution of the MPEG (MP3) format and allows for higher sample rates up to 96kHz. Plus, it can dynamically switch frame lengths between 1024/960 or 128/120 samples for better resolution when required. It performs better at lower file sizes than MP3s, to boot.
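To see why switching frame lengths helps with resolution, this small sketch converts a frame length in samples into the stretch of time it covers. The 44.1kHz sample rate is an assumption chosen for the example, and the function name is made up.

```python
def frame_duration_ms(frame_samples, sample_rate_hz):
    """How much audio, in milliseconds, a single frame covers."""
    return frame_samples / sample_rate_hz * 1000


long_frame_ms = frame_duration_ms(1024, 44_100)  # roughly 23.2 ms
short_frame_ms = frame_duration_ms(128, 44_100)  # roughly 2.9 ms
```

A short frame pins down a sudden sound, such as a drum hit, to a much smaller slice of time, while a long frame gives finer frequency detail for steadier passages.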
Another file type you might encounter is the M4A file. These files are encoded using the AAC format and then stored in an MPEG-4 container, hence the file extension .m4a. Apple popularized this file type as an alternative to MP3. While not quite as universally supported, it isn’t rare by any means.
For these reasons, you can’t directly compare bitrates between AAC and MP3, for instance, and claim that the file with the higher bitrate will sound better. Lower-bitrate AAC and M4A files can still sound good while taking up less space.
That makes formats like OGG Vorbis and AAC appealing for streaming services. They can deliver higher-quality sound while consuming less of your mobile data.
If you don’t want to throw out any frequencies but still want a file that’s smaller than raw data, that’s where FLAC comes in. FLAC does not discard any part of a recording, and therefore it’s called lossless. Apple’s version of a lossless codec is called ALAC. Both of these codecs function rather like a .zip file. If you’ve ever zipped and then unzipped a collection of files, you’ll understand the basic idea. Nothing gets removed. The encoder just looks for ways to consolidate repeating patterns in the data, and those are then reconstructed upon playback.
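The .zip comparison can be tried directly. The sketch below builds one second of a 16-bit test tone, compresses it with Python's standard zlib module, and checks that decompressing gives back every byte. Note that this uses general-purpose zlib compression as a stand-in. It is not FLAC or ALAC, which use audio-specific methods, and the test tone and helper name are assumptions for the example.

```python
import math
import struct
import zlib

SAMPLE_RATE = 44_100


def sine_pcm(freq_hz, seconds, amplitude=0.5):
    """Mono 16-bit little-endian PCM bytes for a simple test tone."""
    count = int(SAMPLE_RATE * seconds)
    samples = [
        round(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(count)
    ]
    return struct.pack(f"<{count}h", *samples)


raw = sine_pcm(441, 1.0)            # 88,200 bytes of uncompressed audio
packed = zlib.compress(raw, 9)      # lossless, general-purpose compression
restored = zlib.decompress(packed)  # should match the input byte for byte
```

A pure tone is highly repetitive, so it shrinks far more than real music would. The point is only that the round trip loses nothing.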
Still, FLAC files will never be as small as MP3 or AAC files. But as bandwidth gets cheaper and more accessible, an increasing number of streaming services offer the ability to stream using FLAC. These are often “HD,” “Ultra HD,” or “HiFi” subscriptions. Amazon Music Unlimited, Tidal HiFi and HiFi Plus, Deezer Premium, and Qobuz all offer FLAC streaming.
Be aware that FLAC files are larger than lossy formats and can eat up a lot of your data. If you save them to a device, they’ll also start taking up storage space pretty quickly.
WAV and AIFF audio formats
Audio recordings can be just pure PCM saved to a device, which is essentially what WAV (on Windows) and AIFF (on Mac) are. They represent some of the earliest forms of storing digital music. These files have no compression or anything else applied to them. In fact, you can find out their file size pretty easily with the following equation:
PCM size (bytes) = sample rate × (bits per sample / 8) × time in seconds × number of channels
As a result, these formats can lead to incredibly large file sizes. That means they’re rather rare for streaming and downloading, although services like HDtracks do offer them. Where these files are really useful is in audio mixing and editing. Because no conversion, compression, or anything else has taken place, it’s easy and quick to edit tracks, save them, and then edit them again as required.
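The PCM size equation above can be checked with a few lines of code. This sketch applies it to CD-quality stereo audio. The track lengths are arbitrary examples, the function name is hypothetical, and real WAV or AIFF files add a small header on top of the raw sample data.

```python
def pcm_size_bytes(sample_rate_hz, bits_per_sample, seconds, channels):
    """Raw PCM size: sample rate x bytes per sample x duration x channels."""
    return sample_rate_hz * (bits_per_sample / 8) * seconds * channels


one_minute = pcm_size_bytes(44_100, 16, 60, 2)      # 10,584,000 bytes, about 10.6 MB
three_minutes = pcm_size_bytes(44_100, 16, 180, 2)  # 31,752,000 bytes, about 31.8 MB
```

At roughly 10MB per minute, a full album stored this way quickly runs into the hundreds of megabytes.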