# How much data can we store in 1 byte, kilobyte, megabyte, gigabyte, etc.?

### A Lonely "Bit"

### Bits Strong Together

### Why is 1 KB 1024 Bytes and not 1000 Bytes?

### How much data can 1 KB store?

### What about 1 MB?

### What about 1 GB?

### What about 1 TB?

### What about 1 PB?

### How much storage do we need to store the whole of humanity's literature?

### There's more

June 11, 2023

In this article, we'll try to visualize how much data the storage of a modern computer can hold. By the end, you'll be able to relate storage sizes to things from day-to-day life.

Everything in this universe is made out of some basic unit.

- In biology, we call it a **cell**. A cell is the basic unit of life.
- In chemistry, we call it an **atom**. An atom is defined as the basic unit of a chemical element.
- Similarly, in computers, we have a **bit**.

*Figure: an analog signal converted into a digital signal.*

A bit is a binary digit, and it is the most basic unit of information in computing. It can hold only one of the two possible values: either 0 or 1, which can represent different states depending upon the use case, such as on/off, true/false, yes/no, etc.

You can implement a bit with a two-state device, such as a switch, a transistor, a magnet, a light bulb, etc.

A single bit is so small that it can't even hold one character, say 'a'. We need more bits to represent a character.

A typical character usually takes 1 byte. A byte is a collection of 8 bits. You might ask why storing a simple character such as 'a' takes 8 bits.

Since our basic unit of storing data is the bit, everything must be translated into binary before the computer can work with it.

A character such as 'a' is first translated into a number called its ASCII code. Each character of the English alphabet is assigned a unique number.

E.g., our 'a' is represented as 97.

Let's convert the number 97 into its binary equivalent. It'll be 01100001. How many bits can you see in it? Eight, right?

Standard ASCII-encoded data has unique values for 128 alphabetic, numeric, or special additional characters and control codes.

Standard ASCII needs only 7 bits per character, but computers store data in whole bytes, so each character occupies 8 bits (one byte) with a leading 0. Thus, you'll need 8 bits to store a single character.

As discussed above, you'll need 8 bits (or a byte) to save a character. With a byte of storage, you can have a total of 256 (2⁸) possible states. Let me list a few of them:

00100000 (32) = ' ' (SPACE)

00100001 (33) = '!' (EXCLAMATION MARK)

00110101 (53) = '5' (DIGIT 5)
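The mapping from a character to its 8-bit pattern can be reproduced with a few lines of Python. This is only a minimal sketch: the helper name `to_bits` is made up for illustration, and it assumes standard ASCII (codes 0 to 127).

```python
def to_bits(ch: str) -> str:
    """Return the 8-bit binary string for a single ASCII character."""
    code = ord(ch)  # character -> numeric code, e.g. 'a' -> 97
    if code > 127:
        raise ValueError(f"{ch!r} is not a standard ASCII character")
    return format(code, "08b")  # binary, zero-padded to a full byte


for ch in ["a", " ", "!", "5"]:
    print(repr(ch), ord(ch), to_bits(ch))

# One byte (8 bits) can take 2**8 distinct states.
print(2 ** 8)
```

Running it prints the code and bit pattern for each character, followed by the number of states a byte can hold.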

Okay, if I have to store my name **"Kishan Kumar"** how many bits or bytes will I require?

Let's see; my name contains 12 characters (including space). So, I'll need 12 bytes (=96 bits).
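A quick way to check this count, assuming the name is stored as plain ASCII (one byte per character):

```python
name = "Kishan Kumar"

encoded = name.encode("ascii")  # one byte per ASCII character
n_bytes = len(encoded)          # number of bytes needed
n_bits = n_bytes * 8            # 8 bits in every byte

print(n_bytes, "bytes =", n_bits, "bits")
```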

12 bytes is nothing if we look at the amount of storage available today, but that wasn't true a few decades back. Back in 1965, the cost of 1 byte was about $1.25. That's insane. Simply to store my name on a computer, I would have had to spend $15.

Compare that with today's (2023) price of about $0.00000003 per byte. Storing my name now costs 12 × $0.00000003 = $0.00000036. Insane.
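The same arithmetic as a small sketch. The two per-byte prices are simply the figures quoted in this article, not independently verified values, and `storage_cost` is a hypothetical helper. `Decimal` is used so that tiny amounts are not distorted by floating-point rounding.

```python
from decimal import Decimal

# Per-byte prices in USD, taken from the figures quoted in the article.
COST_PER_BYTE_1965 = Decimal("1.25")
COST_PER_BYTE_2023 = Decimal("0.00000003")


def storage_cost(n_bytes: int, cost_per_byte: Decimal) -> Decimal:
    """Cost of storing n_bytes at a fixed price per byte."""
    return n_bytes * cost_per_byte


name_bytes = len("Kishan Kumar")  # 12 ASCII characters -> 12 bytes
cost_1965 = storage_cost(name_bytes, COST_PER_BYTE_1965)
cost_2023 = storage_cost(name_bytes, COST_PER_BYTE_2023)

print("1965:", cost_1965, "USD")
print("2023:", cost_2023, "USD")
print("price ratio:", COST_PER_BYTE_1965 / COST_PER_BYTE_2023)
```

Under these figures, a byte became more than 40 million times cheaper between 1965 and 2023.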

1 KB (kilobyte) is equivalent to 1024 bytes. Note that in the usual metric system, a kilogram contains exactly 1000 grams. **But why is it 1024 here?**

In the realm of computer science, we deal in binary, which is based on powers of 2.

The metric system uses powers of 10 and prefixes like kilo, mega, giga, etc. A kilogram is equal to 1000 grams because 10³ = 1000.

In the usual metric system, we have

10⁰ = 1,

10¹ = 10,

10² = 100,

10³ = 1000

Whereas in a binary system, we have

2⁰ = 1,

2¹ = 2,

...

2³ = 8,

...

2¹⁰ = 1024,

2¹¹ = 2048,

...

There is no **1000** in this sequence because computers work in binary. The power of 2 closest to 1000 is 2¹⁰ = 1024, which is why a kilobyte was traditionally taken to be 1024 bytes.

However, this convention conflicts with the metric system, in which kilo, mega, giga, etc. always denote powers of 10. To avoid this ambiguity, the IEC introduced separate prefixes for binary units in 1998: kibi (Ki), mebi (Mi), gibi (Gi), etc. According to this standard, a kibibyte (KiB) equals 1024 bytes, while a kilobyte (kB) equals 1000 bytes. However, this standard has yet to be widely adopted, and many people still use kilobyte to mean 1024 bytes. This article does the same: KB, MB, GB, and so on refer to the binary sizes.
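To make the difference between the two conventions concrete, here is a small sketch that formats a byte count using either binary (1024-based) or decimal (1000-based) units. The function name `humanize` and the unit lists are illustrative choices, not part of any standard library.

```python
BINARY_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]  # steps of 2**10 = 1024
DECIMAL_UNITS = ["B", "kB", "MB", "GB", "TB", "PB"]      # steps of 10**3 = 1000


def humanize(n_bytes: int, binary: bool = True) -> str:
    """Format a byte count with the largest unit that keeps the value below one step."""
    step = 1024 if binary else 1000
    units = BINARY_UNITS if binary else DECIMAL_UNITS
    value = float(n_bytes)
    for unit in units:
        # Stop at the first unit that fits, or at the largest unit we know.
        if value < step or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= step


print(humanize(1024))                   # binary view of 1024 bytes
print(humanize(1024, binary=False))     # decimal view of the same 1024 bytes
print(humanize(10**12))                 # a decimal "1 TB" expressed in binary units
print(humanize(10**12, binary=False))
```

The third line illustrates a common source of confusion: a capacity of 10¹² bytes is 1 TB in decimal units but only about 931 GiB in binary units.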

1 KB = 1024 bytes = 1024 characters ≈ 170 words (if we take the average word length in English to be ~6 characters).

This is roughly what a passage of that size looks like:

She had always loved the sound of rain. It was soothing and calming, like a gentle lullaby. She liked to watch the drops fall from the sky and splash on the windowpane, creating tiny ripples and patterns. She liked to listen to the rhythm of the raindrops on the roof, like a drum beat that matched her heartbeat. She liked to smell the fresh and earthy scent of rain, like a perfume that cleansed the air. She liked to feel the rain on her skin, like a kiss that made her shiver.

Rain was her favorite weather, and she wished it would rain more often. She felt happy and peaceful when it rained, as if nothing could go wrong. She felt free and alive when it rained, as if she could do anything. She felt cozy and warm when it rained, as if she was wrapped in a blanket. Rain was her friend, and she loved it with all her heart.

A megabyte is 1,048,576 (2²⁰) bytes, or 1,024 kilobytes. That is roughly 800 pages of plain text, or ~200,000 words (at about 5 characters per word). For comparison, Introduction to Algorithms by CLRS runs to ~1200 pages.
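These word counts are simple divisions, sketched below. The characters-per-word values are rough assumptions, and `words_that_fit` is a hypothetical helper that assumes one byte per character.

```python
KB = 2 ** 10  # 1,024 bytes
MB = 2 ** 20  # 1,048,576 bytes


def words_that_fit(n_bytes: int, chars_per_word: int) -> int:
    """Whole words that fit in n_bytes at one byte per character."""
    return n_bytes // chars_per_word


print(words_that_fit(KB, 6))  # words in 1 KB at ~6 characters per word
print(words_that_fit(MB, 6))  # words in 1 MB at ~6 characters per word
print(words_that_fit(MB, 5))  # words in 1 MB at ~5 characters per word
```

The result is quite sensitive to the assumed word length, which is why such figures are best read as orders of magnitude.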

A gigabyte is 1,073,741,824 (2³⁰) bytes, 1,024 megabytes, or 1,048,576 kilobytes. It can fit:

- About 280 music tracks, assuming each track is 4 minutes long and encoded at 128 kbps (~3.84 MB per track).
- About 600 photos, assuming each image is 5 megapixels and compressed as JPEG.
- About 70 minutes of video, assuming the video is standard definition (480p) and encoded at 2 Mbps (~15 MB per minute).
- About 19,200 pages of Word documents, assuming each page has 500 words and no images.
- About 5 hours of online gaming, assuming the game uses about 3 MB per minute.
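Bitrate-based estimates like the music and video ones can be sanity-checked with a short calculation. This sketch assumes that 1 kbps means 1,000 bits per second, that there is no container or metadata overhead, and that 1 GB means 2³⁰ bytes; `bytes_for_stream` is a hypothetical helper.

```python
MB = 2 ** 20
GB = 2 ** 30


def bytes_for_stream(bitrate_bits_per_s: int, seconds: int) -> int:
    """Size in bytes of a constant-bitrate stream of the given duration."""
    return bitrate_bits_per_s * seconds // 8  # 8 bits per byte


track = bytes_for_stream(128_000, 4 * 60)       # 4-minute track at 128 kbps
video_minute = bytes_for_stream(2_000_000, 60)  # 1 minute of video at 2 Mbps
gaming_minute = 3 * MB                          # 3 MB of traffic per minute

print("bytes per track:", track)
print("tracks per GB:", GB // track)
print("video minutes per GB:", GB // video_minute)
print("gaming hours per GB:", round(GB / gaming_minute / 60, 1))
```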

A terabyte is 1,099,511,627,776 (2⁴⁰) bytes, 1,024 gigabytes, or 1,048,576 megabytes. It can fit:

- 916,259,689 pages of plain text (1,200 characters per page).
- 4,581,298 books (200 pages, or 240,000 characters, each). The average number of books in a public library in the U.S. is about 10,000 titles.
- 655,360 web pages (with a 1.6 MB average file size).
- 349,525 digital pictures (with a 3 MB average file size).
- 262,144 MP3 audio files (with a 4 MB average file size).
- 1,613 CDs (650 MB each).
- 233 DVDs (4.38 GB each).
- 40 Blu-ray discs (25 GB each).
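Each figure in this list is the terabyte's capacity divided by the size of one item, rounded down. A minimal sketch, assuming binary units throughout and one byte per character; `how_many` is a hypothetical helper, and `Fraction` is used so that sizes such as 1.6 MB are handled exactly.

```python
from fractions import Fraction

MB, GB, TB = 2 ** 20, 2 ** 30, 2 ** 40


def how_many(capacity, item_size) -> int:
    """Number of whole items of item_size bytes that fit in capacity bytes."""
    return int(Fraction(capacity) / Fraction(item_size))


# Item sizes in bytes, using the assumptions stated in the list above.
item_sizes = {
    "pages of plain text": 1_200,
    "books": 240_000,
    "web pages": Fraction("1.6") * MB,
    "digital pictures": 3 * MB,
    "MP3 files": 4 * MB,
    "CDs": 650 * MB,
    "DVDs": Fraction("4.38") * GB,
    "Blu-ray discs": 25 * GB,
}

counts = {label: how_many(TB, size) for label, size in item_sizes.items()}
for label, count in counts.items():
    print(f"{count:>13,} {label}")
```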

A petabyte is 1,125,899,906,842,624 (2⁵⁰) bytes, 1,024 terabytes, 1,048,576 gigabytes, or 1,073,741,824 megabytes. It can fit:

- About 290,000,000 music tracks, assuming each track is 4 minutes long and encoded at 128 kbps.
- About 600,000,000 photos, assuming each image is 5 megapixels and compressed as JPEG.
- About 75,000,000 minutes of video, assuming the video is standard definition (480p) and encoded at 2 Mbps.
- About 19,200,000,000 pages of Word documents, assuming each page has 500 words and no images.
- About 5,000,000 hours of online gaming, assuming the game uses about 3 MB per minute.

You get the point, right? The critical question is: how much storage would we need to store the whole of humanity's literature?

This is a tricky question because many factors are involved. For example, the rate at which books were published has never been constant: world wars, plagues, and other external events all affected it.

One possible way to estimate the amount of storage we need is to use the number of books published worldwide as a proxy for the whole of human literature. According to one source (theconversation.com), about 2.2 million books were published worldwide in 2013. Assuming that each book has about 300 pages and each page has about 500 words, we estimate that the total number of words in all books published in 2013 is about 330 billion words.

For simplicity, let us assume that we use ASCII encoding, which uses one byte (8 bits) to store each character. Assuming that the average word length in all languages is five characters, we can estimate that the average number of bytes per word is 5.

To find out how many bytes are needed to store all words in all books published in 2013, we need to multiply 330 billion words by 5 bytes per word. This gives us about 1.65 trillion bytes or 1.65 terabytes (TB). This is a very rough estimate.

However, this estimate only covers one year of book publishing. If we want to include all books ever published in human history, we need to multiply this number by the number of years since the invention of writing. According to one source (researchgate.net), writing was invented around 3200 BC, meaning writing has existed for about 5200 years. Assuming that the rate of book publishing has been constant throughout history (which is not true), we can multiply 1.65 TB by 5200 years to get about 8.6 petabytes (PB) of data.

Again, this is still a rough estimate that does not account for many factors, such as changes in book publishing rates, languages, formats, genres, etc. It also does not include other types of literature, such as oral traditions, poems, songs, scripts, etc. Therefore, the amount of storage we need to store the whole of humanity's literature may be much higher or lower than this estimate.
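The whole back-of-the-envelope estimate fits in a few lines. Every constant below is one of the assumptions stated in the text, and the results use decimal units (1 TB = 10¹² bytes, 1 PB = 10¹⁵ bytes), as the "1.65 trillion bytes" figure does.

```python
# Assumptions taken from the text above.
BOOKS_PER_YEAR = 2_200_000   # books published worldwide in 2013
PAGES_PER_BOOK = 300
WORDS_PER_PAGE = 500
BYTES_PER_WORD = 5           # 5 ASCII characters per word, 1 byte each
YEARS_OF_WRITING = 5_200     # since roughly 3200 BC

words_per_year = BOOKS_PER_YEAR * PAGES_PER_BOOK * WORDS_PER_PAGE
bytes_per_year = words_per_year * BYTES_PER_WORD
total_bytes = bytes_per_year * YEARS_OF_WRITING

print(f"words per year: {words_per_year:,}")
print(f"bytes per year: {bytes_per_year / 10**12:.2f} TB")
print(f"all of history: {total_bytes / 10**15:.2f} PB")
```

Changing any single assumption, for example the number of years with a meaningful publishing rate, shifts the result by orders of magnitude, which is exactly why the estimate should be treated as rough.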

Exabyte (EB): An exabyte is 1,152,921,504,606,846,976 (2⁶⁰) bytes

Zettabyte (ZB): A zettabyte is 1,180,591,620,717,411,303,424 (2⁷⁰) bytes

Yottabyte (YB): A yottabyte is 1,208,925,819,614,629,174,706,176 (2⁸⁰) bytes

I have refrained from giving examples because these numbers are incomprehensible anyway.
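Every unit in this article is simply the previous one multiplied by 1024, so the whole ladder can be generated in a short loop. This is an illustrative sketch that uses the binary meaning of each prefix; the `sizes` dictionary is a name chosen for the example.

```python
UNITS = ["KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

# The unit at position i (starting from 0) is 2 ** (10 * (i + 1)) bytes.
sizes = {unit: 2 ** (10 * (i + 1)) for i, unit in enumerate(UNITS)}

for i, (unit, size) in enumerate(sizes.items()):
    print(f"1 {unit} = 2^{10 * (i + 1)} = {size:,} bytes")
```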

I hope this article helped you understand the basics of computer storage. Thank You.
