File Compression

I don’t think the web would be such a fun place without file compression. A core part of how the web works, file compression lets netizens transfer files that would otherwise take too much bandwidth and time. Whenever you view a JPEG image (I don’t know about you, but I love viewing images of flying unicorns) or open a ZIP file, you’re benefiting from file compression.

So most readers, if not all, have probably asked at some point: how does file compression work? In this article, I will answer that question. So, what are we waiting for? Let’s begin.

What Does Compression Mean?

To put it in simple words, file compression is the act of reducing a file’s size while preserving the original data, or at least the parts of it that matter. The smaller the file, the less space it takes up on a storage device, and the easier it is to transfer over the internet or otherwise.

However, you should know that there is a limit to how far you can compress a file. Compression is not infinite; you can’t keep compressing a file until its size shrinks to nothing.
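To see that limit for yourself, here is a minimal sketch that compresses the same data over and over with Python’s built-in zlib module. The sample text and the word list are made up for illustration; any reasonably varied data should behave in a similar way.

```python
import random
import zlib

# Build some repetitive sample text from a small, made-up vocabulary.
rng = random.Random(42)
words = ["unicorn", "rainbow", "cloud", "sky", "flying", "image", "file", "zip"]
data = " ".join(rng.choice(words) for _ in range(20_000)).encode("utf-8")

# Compress the output of the previous pass, five times in a row.
sizes = [len(data)]
current = data
for _ in range(5):
    current = zlib.compress(current, 9)
    sizes.append(len(current))

for n, size in enumerate(sizes):
    print(f"after {n} pass(es): {size} bytes")
```

The first pass should shrink the data a lot. After that, there is hardly any redundancy left to remove, so later passes typically leave the size about the same or even make it slightly larger.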

File compression is generally split into two main types:

  • Lossy 
  • Lossless

Let’s see how both of them work.

How File Compression Works: Lossy Compression

This type of compression reduces the size of a file by removing unnecessary bits of information. Many common image, video, and audio formats use this sort of compression; well-known examples include MP3 and JPEG. For these types of media, a perfect representation of the source isn’t necessary.

As mentioned above, MP3 uses this type of compression. An MP3 file doesn’t include all the audio information from the original recording; some of the sounds that humans can’t hear are removed. Since they are inaudible, it doesn’t matter that they are missing; removing that information results in a smaller file with basically no audible drawback.

Another example that I mentioned above is JPEG, which removes the parts of an image that are non-vital. Let’s say there is a picture of a blue sky; instead of using a plethora of different shades of blue, JPEG compression may change all the sky pixels to just one or two shades of blue.
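Here is a rough sketch of the blue-sky idea. It is a simplification: real JPEG quantizes frequency coefficients of small pixel blocks rather than raw pixel values, and the pixel values and the quantize helper below are made up for illustration.

```python
# Hypothetical sky pixels: the blue channel (0-255) of 12 pixels, all slightly different.
sky_blues = [200, 201, 203, 204, 206, 207, 209, 210, 212, 214, 215, 217]

def quantize(value, step):
    """Snap a value to the middle of its step-wide bucket (this discards detail)."""
    return (value // step) * step + step // 2

lossy_sky = [quantize(b, 16) for b in sky_blues]

print(len(set(sky_blues)), "shades before")                        # 12 shades before
print(len(set(lossy_sky)), "shades after:", sorted(set(lossy_sky)))  # 2 shades after: [200, 216]
```

Twelve shades become two, which is far easier to store compactly, but the original twelve values can no longer be recovered. That is what makes it lossy.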

However, the more you compress a file, the lower its quality gets. In other words, the drop in quality becomes increasingly noticeable as you compress a file further. You’ve probably experienced this with pixelated JPEG files or muddy MP3 files downloaded from the internet.
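The trade-off can be sketched with a toy experiment. The signal below is invented, rounding values down to a multiple of a step stands in for a real lossy codec, and zlib is only used to measure how small the result can be stored.

```python
import math
import zlib

# Hypothetical signal: a smooth wave plus a little fine detail, as 8-bit samples.
samples = [
    int(128 + 100 * math.sin(i / 40) + 10 * math.sin(i * 1.7))
    for i in range(4000)
]

def quantize(value, step):
    """Round down to a multiple of step; a bigger step throws away more detail."""
    return (value // step) * step

results = {}
for step in (1, 4, 16, 64):
    lossy = [quantize(s, step) for s in samples]
    size = len(zlib.compress(bytes(lossy), 9))
    error = sum(abs(a - b) for a, b in zip(samples, lossy)) / len(samples)
    results[step] = (size, error)
    print(f"step {step:2d}: {size:5d} bytes, average error {error:.2f}")
```

A step of 1 changes nothing, so the error is zero. As the step grows, the average error climbs while the stored size drops, which is the same bargain you make when you lower the quality setting of a JPEG or MP3.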

You should opt for this type of compression if your file contains more information than you need for your purposes. Let’s say you have a humongous RAW image file. While you probably want to preserve that image’s quality when printing it onto a large banner, it’s quite unnecessary to upload the RAW file to Instagram or any other social media platform of your liking, because the image contains so much data that much of it isn’t noticeable when viewed on social media sites. Compressing the picture to a high-quality JPEG removes some of the information, but the picture looks almost the same to the naked eye.

How File Compression Works: Lossless Compression

This type of compression reduces the file size in a way that still lets you reconstruct the original file perfectly. Unlike lossy compression, it involves no loss of information. Instead, it essentially works by removing redundancy.
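A quick way to check the “perfect reconstruction” part is a round trip through a lossless compressor. This sketch uses Python’s built-in zlib module; the sample data is made up.

```python
import zlib

# Made-up, highly redundant sample data.
original = b"flying unicorns " * 500

compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "bytes before,", len(compressed), "bytes after")
print("identical after round trip:", restored == original)
```

The compressed version is much smaller, yet decompressing it gives back every single byte of the original.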

I think you will understand it better if I give you an example. Let’s say you have 20 bricks stacked on top of each other: four black, ten orange, and six pink. Stacking them is a simple way to represent those bricks, but it is not the only way. There’s another, much better way to do so.

Instead of showing all 20 bricks, we can keep only one of each color and remove all the others. Then, if we use numbers to show how many bricks of each color there were, we’ve represented the same information using just three bricks instead of twenty. That’s far fewer, don’t you agree?

This was just a simple example of how lossless compression works; I hope it helped. It stores the same information more efficiently by removing redundancy. Let me give another example. Let’s say you have a file with the string below:

aaasssssqqqqqqqqq

It can be “compressed” to the following, much shorter form:

a3s5q9

This lets us use six characters instead of seventeen to represent the same data, which is quite a significant saving.
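This trick is commonly called run-length encoding. Here is a minimal sketch of it in Python; the function names are my own, and the decoder assumes the original text contains no digits.

```python
from itertools import groupby

def rle_encode(text):
    """Replace each run of a repeated character with the character and its count."""
    return "".join(f"{char}{len(list(run))}" for char, run in groupby(text))

def rle_decode(encoded):
    """Rebuild the original text; assumes the original contains no digits."""
    out = []
    i = 0
    while i < len(encoded):
        char = encoded[i]
        i += 1
        digits = ""
        while i < len(encoded) and encoded[i].isdigit():
            digits += encoded[i]
            i += 1
        out.append(char * int(digits))
    return "".join(out)

encoded = rle_encode("aaasssssqqqqqqqqq")
print(encoded)              # a3s5q9
print(rle_decode(encoded))  # aaasssssqqqqqqqqq
```

Note that this only pays off when the data has long runs. A string like abc would turn into a1b1c1, which is longer than the original, so real compressors use more sophisticated ways of finding redundancy.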

When to Use Lossy vs. Lossless Compression

Now that you know what lossy and lossless compression are, you might be wondering when you should use one or the other. One thing to keep in mind is that there is no “better” form of compression; it all depends on what you’re using the files for.

In general, you should opt for lossy compression when an imperfect copy of the source material works for you, and lossless compression when you need a perfect copy. Let’s go through another example to see how they can work in harmony.

Say that you were cleaning your room, found your ancient CD collection, and want to digitize it so you have all your music on your laptop. When you rip your CDs, it makes sense to use a lossless format like FLAC. This lets you have a master copy on your laptop that’s as good as the original CD.

Later, perhaps you want to put some music on your mobile device so you can listen on the go. It probably doesn’t matter to you whether the music on your mobile device is in perfect quality, so you can convert the FLAC files to MP3. What you get is an audio file that’s still perfectly listenable but doesn’t take up as much space on your phone. Because FLAC is lossless, the MP3 converted from the FLAC will be as good as if you’d created it right from the original CD.
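The master-copy workflow can be sketched with a toy model. Nothing here touches real audio formats: zlib stands in for FLAC, a made-up rounding function stands in for MP3, and the sample values are invented.

```python
import zlib

# Hypothetical "CD audio": a short pattern of 8-bit sample values, repeated.
cd_audio = bytes([12, 13, 15, 90, 91, 94, 200, 203, 205, 207] * 100)

def lossy_encode(samples, step=16):
    """Toy stand-in for MP3: drop fine detail by rounding down to multiples of step."""
    return bytes((s // step) * step for s in samples)

# Lossless master copy (zlib standing in for FLAC).
master = zlib.compress(cd_audio)

phone_from_master = lossy_encode(zlib.decompress(master))
phone_from_cd = lossy_encode(cd_audio)

print("master restores the CD exactly:", zlib.decompress(master) == cd_audio)
print("same phone copy either way:", phone_from_master == phone_from_cd)
print("phone copy differs from the CD:", phone_from_cd != cd_audio)
```

All three lines print True. The lossless master gives back the CD data exactly, so a lossy copy made from it is identical to one made from the CD itself, while the lossy copy can never be turned back into the original.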

Which type of compression to opt for may also depend on the type of data represented in a file. Because PNG images use lossless compression, they offer small file sizes for images with lots of uniform space, like computer screenshots. However, you’ll notice that PNGs take up much more space when representing the jumble of colors in real-world photos.
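You can get a feel for this without any image library. PNG compresses pixel data with DEFLATE, the same algorithm behind Python’s zlib module, so the sketch below compares how well zlib handles flat color versus noisy pixels. The image dimensions are made up, and random noise is only a rough, worst-case stand-in for a photo; real photos usually compress somewhat better than this.

```python
import random
import zlib

WIDTH, HEIGHT = 200, 100  # made-up image dimensions, one byte per pixel

# "Screenshot": large areas of a single flat color.
half = WIDTH * HEIGHT // 2
screenshot = bytes([255] * half + [40] * half)

# "Photo": every pixel a little different (random noise as a rough stand-in).
rng = random.Random(0)
photo = bytes(rng.randrange(256) for _ in range(WIDTH * HEIGHT))

screenshot_size = len(zlib.compress(screenshot, 9))
photo_size = len(zlib.compress(photo, 9))

print("screenshot:", screenshot_size, "bytes")
print("photo:     ", photo_size, "bytes")
```

Both inputs are 20,000 bytes, but the flat “screenshot” shrinks to a tiny fraction of that, while the noisy “photo” barely shrinks at all because there is almost no redundancy to remove.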

Thank you for reading!