Unblocking the Blockchain: Hashing

Heng Kiong teaches Information Technology, including business analytics and management information systems, at a tertiary institute.

What is Hashing?

In two earlier articles, on public key cryptography and digital signatures, we learned that a digital signature proves that a message originates from a specific person and no one else. Digital signatures therefore provide message authentication: they assure the participants of a transaction that the information originated from the signer and can be trusted.

In this article, we are going to explore a second cryptography concept that is essential in a blockchain network, known as "hashing."

Hashing is the process of applying a hash function to some input. A hash function transforms a string of characters of any length into a fixed-size value, called a hash or digest, that acts as a fingerprint of the original string. Because any change to the input produces a different hash, comparing hashes confirms that data has not been tampered with. Bitcoin, for example, uses the SHA-256 algorithm, which takes an input and turns it into a fixed-size 256-bit (32-byte) output. For example,

BLOCKCHAIN - dffdca1f7dd5c94afea2936253a2463a26aad06fa9b5f36b5affc8851e8c8d42

blockchain - ef7797e13d3a75526946a3bcf00daec9fc9c9c4d51ddc7cc5df888f74dd434d1

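SHA stands for Secure Hash Algorithm, and the 256 is the length of the output in bits, shown above as 64 hexadecimal characters. Note that changing only the letter case of the input produces a completely different hash. A blockchain such as Bitcoin uses a 256-bit hash to represent its current state at any given time.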
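To try this yourself, here is a minimal Python sketch using the standard `hashlib` module. The helper name `sha256_hex` is just an illustrative choice, and the strings are assumed to be encoded as UTF-8 before hashing. It lets you check the two digests above and see that the output length stays the same however long the input is.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as 64 hexadecimal characters."""
    # Assumption: the text is encoded as UTF-8 before it is hashed.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Two inputs that differ only in case, plus one much longer input.
for word in ("BLOCKCHAIN", "blockchain", "a much longer input " * 100):
    digest = sha256_hex(word)
    # Each hex character carries 4 bits, so 64 characters = 256 bits.
    print(f"{word[:20]!r:<24} -> {digest} ({len(digest) * 4} bits)")
```

Every line of output ends with "(256 bits)", even for the 2,000-character input.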

Quiz

For each question, choose the best answer. The answer key is below.

  1. A small change in the input should yield a completely different output.
    • True
    • False
  2. Regardless of the length of the input, the same hash function should always yield an output of the same length.
    • True
    • False
  3. Hashing the same input string should always yield the same output.
    • True
    • False
  4. Given the hashed output, it should be extremely difficult to guess the input.
    • True
    • False
  5. Two different input strings should not generate the same output.
    • True
    • False

Answer Key

  1. True
  2. True
  3. True
  4. True
  5. True

An Analogy

Let us use bar codes to explain the concept. A bar code is an image of bars and spaces that identifies a particular product. For example, in a supermarket, a bar code uniquely represents the item to which it is affixed. However, it is difficult to guess the product just by looking at the bar code.

Bar codes - Analogy of Hashing

Verifying the Checksum of Files

In computer security systems, hashing is used to ensure that transmitted messages have not been tampered with.

Hashing as used in verification

Hashing used in verifying the integrity of a downloaded file

When you download a file, the site may publish a checksum value alongside it so that you can verify whether the downloaded file has been corrupted. Example:

Download link

Filename: Blockchain_FINAL_040518.mp4

SHA256: 6528d13bc80d1e4603f63face7fed28b462d7eaa735f8f5ccdeb94b914723269

Verifying

Hashing lets you confirm that the file you received is identical to the original and that no corruption occurred during the download.

Once you have downloaded the file, compute its hash and check that it matches the published checksum.

On Windows

Use the built-in CertUtil utility

Open the command prompt in Windows

Navigate to the folder of the downloaded file

Run the following command:

C:\>CertUtil -hashfile Blockchain_FINAL_040518.mp4 SHA256


On a Mac

Open Terminal

Run the following command:

shasum -a 256 /Users/<path name>/videos/Blockchain_FINAL_040518.mp4

Command Prompt - DIRectory

Command Prompt - Hash 256

Compare the SHA-256 value that the command prints with the published one. If the two match exactly, the download is intact.
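The same check can be scripted. The following is a sketch in Python rather than a replacement for the commands above. The function names `sha256_of_file` and `matches_published` are hypothetical helpers introduced here for illustration. The file is read in chunks so that a large video does not need to fit in memory.

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 of a file, reading it 1 MiB at a time."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() keeps calling f.read() until it returns b"" at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published(path: str, published_hex: str) -> bool:
    """Compare a file's hash with a published checksum, ignoring case and spaces."""
    expected = "".join(published_hex.split()).lower()
    return sha256_of_file(path) == expected

# Example usage (assumes the file is in the current folder):
# ok = matches_published(
#     "Blockchain_FINAL_040518.mp4",
#     "6528d13bc80d1e4603f63face7fed28b462d7eaa735f8f5ccdeb94b914723269",
# )
# print("File is intact" if ok else "Checksum mismatch")
```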

Characteristics of a Good Hash Function

A good hash function has the following characteristics, which make it suitable for blockchains.

  1. Unpredictable - It is difficult to predict what the hash of any input will be until you calculate it.
  2. Computationally efficient - The hash of any input is quick to calculate.
  3. Collision resistant - It is difficult to find two inputs that produce the same output.
  4. Hides information - It is difficult to glean anything useful about the input from the output.
  5. Well distributed - The output should look random.
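One way to see the "unpredictable" and "well distributed" properties is to change a single character of the input and count how many of the 256 output bits flip. The sketch below does this in Python. The helper names are illustrative, and the test strings are arbitrary examples. For a well-behaved hash function you would expect roughly half of the bits to differ.

```python
import hashlib

def sha256_as_int(text: str) -> int:
    """Return the SHA-256 digest of a UTF-8 string as one 256-bit integer."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")

def differing_bits(a: str, b: str) -> int:
    """Count the bit positions where the two digests disagree."""
    # XOR leaves a 1 in every position where the two digests differ.
    return bin(sha256_as_int(a) ^ sha256_as_int(b)).count("1")

# The inputs differ by one letter; expect a count somewhere near 128 of 256.
print(differing_bits("blockchain", "blockchaim"))

# Identical inputs always give identical digests, so no bits differ.
print(differing_bits("blockchain", "blockchain"))
```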

Hashing in the Blockchain

A blockchain keeps an entire list of transactions by linking together a series of timestamped data records that are immutable. There is no central authority in a blockchain. Participating parties in a blockchain network need an assurance of trust that a record has not been tampered with or modified in an unauthorised way.

Hashing provides that assurance. The hash represents the current state of all the transactions that have taken place so far, so any change to an earlier record changes the hash.
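As a rough illustration of this idea, the Python sketch below chains a few made-up records together by hashing each record with the hash of the one before it. This is a deliberately simplified toy and not the block format of Bitcoin or any real blockchain. The function names and the sample transactions are assumptions made for the example.

```python
import hashlib

def block_hash(previous_hash: str, data: str) -> str:
    """Hash a record together with the hash of the record before it."""
    return hashlib.sha256((previous_hash + data).encode("utf-8")).hexdigest()

def build_chain(records):
    """Return the running hash after each record, starting from all zeros."""
    hashes = []
    previous = "0" * 64  # placeholder "previous hash" for the first record
    for record in records:
        previous = block_hash(previous, record)
        hashes.append(previous)
    return hashes

original = build_chain(["Alice pays Bob 5", "Bob pays Carol 2", "Carol pays Dave 1"])
tampered = build_chain(["Alice pays Bob 50", "Bob pays Carol 2", "Carol pays Dave 1"])

# Only the first record was altered, yet every later hash changes as well.
for before, after in zip(original, tampered):
    print(before[:16], after[:16], before == after)
```

Because each hash depends on everything before it, the latest hash alone is enough to detect a change anywhere in the history.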

Hashing in a Blockchain

What's Next

In a blockchain, transactions are linked together in chronological order to form a continuous chain of blocks. Hashing secures the data in that chain. We will look at how hashing is used in a blockchain in the next article.

Hashing is also used in Bitcoin mining, where miners must find a block header whose hash starts with a certain number of zeros. We will cover this later.
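As a small preview, the search for a hash with leading zeros can be sketched in Python as follows. This is a toy under stated assumptions: the header is an arbitrary string, the number being varied is appended as text, and "zeros" means leading hexadecimal zeros. Real Bitcoin mining hashes a binary block header twice with SHA-256 and compares the result with a numeric target, so treat this only as an illustration of the trial-and-error idea.

```python
import hashlib
from itertools import count

def mine(header: str, zeros: int):
    """Try 0, 1, 2, ... until the hash of header + number starts with the zeros.

    Returns a (number, digest) pair for the first number that works.
    """
    prefix = "0" * zeros
    for nonce in count():
        digest = hashlib.sha256(f"{header}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest

# Four leading hex zeros takes tens of thousands of attempts on average.
nonce, digest = mine("example block header", 4)
print(nonce, digest)
```

Finding the number takes many attempts, but anyone can confirm the answer with a single hash calculation.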

Article Navigation

Check out the full series:

Part 1 - What is Blockchain?

Part 2 - Centralised vs. Decentralised Databases

Part 3 - Digital Signatures

Part 4 - Private-key Cryptography

Part 5 - Public-key Cryptography

Part 6 - Cryptography and Digital Signatures

Part 7 - Hashing

© 2018 Heng Kiong Yap