Cryptographic Hash Functions

2014-03-30 - Cryptographic Terminology, Cryptography, General, Security, Series

Introduction

In Certificates and the Chain of Trust I mentioned that a digital signature is often not signing the document itself but rather a digital digest of the document. In this article, I would like to investigate how you can generate such a digest and what properties a digest value needs to have to be usable for legally binding signatures.

Requirements for a Good Message Digest

If you use a digital message digest as the base of a signature, the digest has to satisfy several constraints to be of any value. Let us first look at what makes a good non-digital signature:

It is reasonably hard to duplicate the signature onto a different document than the one originally signed.
It is reasonably hard to alter the signed document without leaving obvious marks.

These are the two properties that we are after for digital signatures too. If the signature is just an encrypted version of the document itself, encrypted with the private key of the signer, these properties are pretty obviously met. Assuming an adversary does not possess the private key, they can neither alter an existing signed document nor sign a new one. However, if the signature is not on the document itself but on something derived from that document, there are additional considerations that must be taken into account.

To check the validity of a digest-based signature, you first decrypt the encrypted digest using the public key of the signer. Then you take the document and recalculate its digest. If the decrypted and the newly calculated values match, you would like to be able to assume that the signature is valid.

A problem arises if an adversary could change the document without altering the digest value, or if they could create a new document that has the same digest value. If the adversary is the person asking you to sign a document in the first place, it would also be a problem if they could create two documents that have the same digest. This type of attack is easier to execute, as the adversary can keep changing both documents until a digest match is found.

To summarize: for a digital digest to be a good base for a digital signature, it must meet these requirements:

It should be infeasible to create two documents with the same digest value.
It should be infeasible to create a new document that matches the digest of an existing document.
It should be infeasible to alter an existing document without also changing its digest value.
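
The verification procedure described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: it uses textbook RSA without any padding, a small modulus built from two well-known Mersenne primes, and SHA-256 as the digest. The names `digest_as_int`, `sign` and `verify` are hypothetical, and nothing here is suitable for real signatures.

```python
import hashlib

# Toy key material (assumption, for illustration only): two Mersenne primes.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def digest_as_int(document: bytes) -> int:
    """Hash the document and map the digest into the range of the toy modulus."""
    value = int.from_bytes(hashlib.sha256(document).digest(), "big")
    # The reduction is only needed because the toy modulus is so small.
    return value % N


def sign(document: bytes) -> int:
    """Encrypt the digest of the document with the private key."""
    return pow(digest_as_int(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Decrypt the signature with the public key and compare it to a fresh digest."""
    return pow(signature, E, N) == digest_as_int(document)


contract = b"I owe Alice 10 dollars."
signature = sign(contract)

print(verify(contract, signature))                      # unchanged document
print(verify(b"I owe Alice 1000 dollars.", signature))  # altered document
```

The first check should succeed and the second should fail, because altering the document changes its digest. The signature is only as trustworthy as the digest: if an adversary could find a second document with the same digest, `verify` would accept it as well.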

Cryptographic Hash Function

A hash function is a function that turns a document into a fixed-length binary value, with the added requirement that the same document always results in the same value. Hash functions are in widespread use throughout all areas of software development. In SQL Server, for example, they are used in hash join, hash group and parallelism exchange operators.
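
As a small illustration of these two properties, the following Python sketch hashes a short and a long document. SHA-256 from the standard library is used here purely as a convenient example; the helper name `fingerprint` is an assumption of this sketch.

```python
import hashlib


def fingerprint(document: bytes) -> bytes:
    """Return the SHA-256 hash value of the document."""
    return hashlib.sha256(document).digest()


short_doc = b"abc"
long_doc = b"abc" * 100_000

# The length of the hash value does not depend on the length of the input.
print(len(fingerprint(short_doc)))  # 32 bytes
print(len(fingerprint(long_doc)))   # 32 bytes

# Hashing the same document again always yields the same value.
print(fingerprint(short_doc) == fingerprint(short_doc))
```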

Most hash functions are designed for speed and not necessarily for cryptographic use. A non-cryptographic hash function can for example be found in the SQL Server CHECKSUM function. It is fast to execute and any accidental change to the data the checksum was calculated over will likely result in a change of the checksum. However, that function does not provide any cryptographic strength and it is easy to create a new document that matches the checksum of a given document.
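
To see why a fast checksum is not enough, consider the following sketch. The function `toy_checksum` is a hypothetical byte-sum checksum made up for this example; it is not the algorithm behind the SQL Server CHECKSUM function, but it shows the same kind of weakness.

```python
import hashlib


def toy_checksum(data: bytes) -> int:
    """Hypothetical non-cryptographic checksum: sum of all bytes modulo 2**32."""
    return sum(data) % 2**32


original = b"Pay Bob 19 dollars"
forged = b"Pay Bob 91 dollars"  # same bytes, two digits swapped

# The toy checksum cannot tell the two documents apart ...
print(toy_checksum(original) == toy_checksum(forged))

# ... while a cryptographic hash such as SHA-256 produces different values.
print(hashlib.sha256(original).digest() == hashlib.sha256(forged).digest())
```

An accidental single-byte change would usually alter the toy checksum, but an adversary only has to reorder bytes to produce a matching document.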

A cryptographic hash function is a hash function whose digest values satisfy four additional requirements. Three of them have been listed at the end of the previous section. Additionally, the hash value must not reveal any information about the hashed document, not even something as benign as the document type.

The fourth property is required because hash functions are often used for one-way encryption, for example of passwords. If you store only the cryptographic hash value of a password, a malicious user who obtains the stored value cannot read the password from it, provided the hash value does not leak any information about the hashed password.
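
A minimal sketch of storing only a hash value instead of the password could look as follows. Note the assumptions: the article describes storing a plain cryptographic hash, while this sketch goes one step further and uses a random salt and PBKDF2 key stretching from the Python standard library. The iteration count and the function names `hash_password` and `check_password` are chosen for illustration only.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, chosen for illustration only


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); only these two values are stored, never the password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest for a login attempt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)


salt, stored = hash_password("correct horse battery staple")
print(check_password("correct horse battery staple", salt, stored))
print(check_password("wrong guess", salt, stored))
```

The login check never needs the original password in stored form; it only repeats the one-way calculation and compares the results.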

Hash Functions in SQL Server

For cryptographic purposes, SQL Server implements the following hash algorithms, which are available through the HASHBYTES function: MD2, MD4, MD5, SHA, SHA1, SHA2_256 and SHA2_512. However, most of these are considered insecure today and should not be used. Currently, only SHA2_256 and SHA2_512 provide cryptographic security. Both SHA2 algorithms were introduced with SQL Server 2012. In earlier SQL Server versions you have to fall back to SHA1, but be aware of its shortcomings.

Summary

Hash functions provide a way to calculate a digital digest value of a document. Hash functions that additionally fulfill four cryptographic requirements are called cryptographic hash functions. Cryptographic hash functions can for example be used as the base of a digital signature. They also find widespread use in the storage of passwords.