Audio on a CD (or in a .wav file ripped from one) is stored uncompressed. It has a constant bitrate and always takes the same amount of storage space regardless of what audio is actually being stored. This is why writable CDs always state the number of minutes of audio you can store on them (80 minutes for a typical 700MB disc). It also means that a 3 minute song file and a file that is just 3 minutes of silence take up exactly the same amount of storage space (roughly 30 Megabytes).
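To see where the 30 Megabyte figure comes from, here is a minimal sketch of the arithmetic. It assumes the standard CD audio parameters (44.1 kHz sample rate, 16 bits per sample, 2 channels), which are not stated in the text above, and the function name is purely illustrative.

```python
# CD audio parameters (standard Red Book values; an assumption, not from the post).
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def pcm_size_bytes(seconds: float) -> int:
    """Size of uncompressed audio: depends only on duration, never on content."""
    bits_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS  # 1,411,200 b/s
    return int(seconds * bits_per_second / 8)


three_minutes = pcm_size_bytes(3 * 60)
print(three_minutes)                   # 31752000 bytes
print(round(three_minutes / 1e6, 2))   # 31.75, i.e. roughly 30 Megabytes
```

Nothing in the calculation looks at the audio itself, which is why a song and 3 minutes of silence come out the same size.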
Now let’s go back in time a bit to the turn of the century, when ‘high-speed’ Internet wasn’t really something that everyone had, and even if you did have it, it was well under 1Mb/s (keep in mind that’s bits per second, not bytes; a bit is 1/8 of a byte). Dial-up was the go-to for the majority of Internet users (56Kb/s at best), so every bit of data you could save was crucial.
In 1999 the infamous Napster was released, the service that really kicked off the music sharing era. But with a 3 minute song being a whopping 30 Megabytes, downloading one over a 56Kb/s connection would take around 75 minutes at best, completely saturating your connection the whole time. That’s a long time for a single song.
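The download time can be checked with a quick back-of-the-envelope calculation. This is a sketch under ideal conditions: it assumes the modem actually sustains its full 56Kb/s with no protocol overhead, and that the song is 3 minutes of CD-quality audio (44.1 kHz, 16-bit, stereo). The names are illustrative.

```python
def download_minutes(size_bytes: int, link_bits_per_second: int) -> float:
    """Best-case transfer time: total bits divided by link speed."""
    return size_bytes * 8 / link_bits_per_second / 60


SONG_BYTES = 31_752_000  # 3 minutes of uncompressed CD-quality audio (assumed)
DIALUP_BPS = 56_000      # 56Kb/s dial-up, best case

minutes = download_minutes(SONG_BYTES, DIALUP_BPS)
print(round(minutes, 1))  # 75.6
```

Real dial-up connections rarely reached their rated speed, so the actual wait would have been longer.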
Winamp + Napster (The go-to back in the day)
This is where the .mp3, which was formally introduced back in 1993, got its chance to spread its wings and begin its complete takeover as the de facto standard in audio storage. The magic behind the .mp3 (and lossy audio compression in general) is that it takes the original uncompressed file and, using a complex algorithm, removes ‘unnecessary’ details that your brain wouldn’t really notice were missing.
Originally most files were encoded at 128Kb/s, and at that level you could get a decent-sounding file at a small fraction of the size of the original (roughly one eleventh). Now that Internet speeds have drastically increased, you would typically encode at 320Kb/s, which is the highest bitrate the .mp3 standard supports. Beyond that you run into the law of diminishing returns and it just isn’t worth it.
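To put numbers on those bitrates, here is a rough sketch of the resulting file sizes for a 3 minute song. It assumes constant-bitrate encoding, ignores tags and headers, and compares against an uncompressed size based on the standard CD parameters (44.1 kHz, 16-bit, stereo); the names are illustrative.

```python
def cbr_size_bytes(seconds: float, kilobits_per_second: int) -> int:
    """Approximate size of a constant-bitrate file, ignoring tags and headers."""
    return int(seconds * kilobits_per_second * 1000 / 8)


DURATION_S = 180
UNCOMPRESSED_BYTES = 31_752_000  # 3 minutes of CD-quality audio (assumed)

for kbps in (128, 320):
    size = cbr_size_bytes(DURATION_S, kbps)
    print(kbps, size, f"{size / UNCOMPRESSED_BYTES:.0%}")
# 128 2880000 9%
# 320 7200000 23%
```

Even at the highest .mp3 bitrate the file is still under a quarter of the uncompressed size.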
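Now there is also something called ‘lossless’ compression. This is usually stored in the previously mentioned .flac (Free Lossless Audio Codec) format. A losslessly compressed file retains the exact quality of the original file but at a smaller file size. It does this without throwing any sound away: instead of removing frequencies, it stores the same data more efficiently. FLAC predicts each sample from the ones before it and records only the (usually small) difference between the prediction and the real value, packed into a compact code, so decoding gives back the original bit for bit. How much space you save depends on how predictable the audio is. So, for example, something like an audiobook, where it is only the human voice (with a typical fundamental frequency of roughly 80-255Hz) and plenty of pauses, tends to compress well, while dense, noisy music compresses less.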
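The key property of lossless compression is that decoding returns exactly the samples that went in. The sketch below illustrates this with a deliberately simplified stand-in, not the real FLAC algorithm: it predicts each sample as "same as the previous one", keeps only the differences, and hands them to `zlib` from the Python standard library (FLAC itself uses linear prediction and Rice coding). The test tone and function names are hypothetical.

```python
import math
import struct
import zlib


def make_tone(n_samples: int, freq_hz: float = 220.0, rate_hz: int = 44_100) -> list[int]:
    """Hypothetical test signal: a quiet sine wave as 16-bit-range integers."""
    return [round(8000 * math.sin(2 * math.pi * freq_hz * i / rate_hz))
            for i in range(n_samples)]


def encode(samples: list[int]) -> bytes:
    # Predict each sample as equal to the previous one; keep only the error.
    prev = 0
    residuals = []
    for s in samples:
        residuals.append(s - prev)
        prev = s
    # 32-bit fields so a difference of two 16-bit samples can never overflow.
    raw = struct.pack(f"<{len(residuals)}i", *residuals)
    return zlib.compress(raw, 9)


def decode(blob: bytes) -> list[int]:
    raw = zlib.decompress(blob)
    residuals = struct.unpack(f"<{len(raw) // 4}i", raw)
    samples, prev = [], 0
    for r in residuals:
        prev += r  # undo the prediction step
        samples.append(prev)
    return samples


samples = make_tone(44_100)          # one second of audio
blob = encode(samples)
assert decode(blob) == samples       # bit-for-bit identical: nothing was lost
print(len(samples) * 2, len(blob))   # original 16-bit size vs. compressed size
```

The round-trip assertion is the whole point: unlike an .mp3, no information is discarded, so the saving comes purely from describing the same data more compactly.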
There is also another type of audio compression which is compression of dynamic range, but that’s a topic for another post…