We've explored how cryptography can be used to achieve confidentiality through encryption, so let's turn to data integrity. Integrity is one of the core security requirements in cyberspace: we need to know whether data has been tampered with or altered.
The basic solution is the cryptographic hash, which uses a hashing algorithm to create a short image (or "digest") of a piece of data. If the hashing algorithm is strong enough and is doing its job properly, the image will be, for all practical purposes, unique to the piece of data that was fed into the algorithm. Therefore, if the piece of data is changed, the alteration will be detected when it is rehashed, because the altered data will produce a different hash. If the hashes differ, the piece of data has been altered, and we can take whatever action we deem necessary to deal with the problem.
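To make the rehash-and-compare idea concrete, here is a minimal sketch in Python using SHA-256 from the standard library. The messages and the helper names (sha256_hex, is_unaltered) are my own illustrative assumptions, not part of any particular product.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, expected_hash: str) -> bool:
    """Rehash the data and compare the result with the hash recorded earlier."""
    return sha256_hex(data) == expected_hash


# Hypothetical example data: the hash is recorded when the data is known to be good.
original = b"Pay Alice 100 pounds"
recorded_hash = sha256_hex(original)

# A hypothetical tampered copy that differs by a single character.
received = b"Pay Alice 900 pounds"

print(is_unaltered(original, recorded_hash))  # True
print(is_unaltered(received, recorded_hash))  # False
```

One caveat with this sketch: the check is only as trustworthy as the recorded hash. If an attacker can replace both the data and the stored hash, the comparison will still pass.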
Here are some more interesting features of cryptographic hashes:
The algorithm will ingest data of any length and always produce a hash of the same length. So hashing all of the works of Shakespeare in one go will produce a hash of the same length as one for this blog. This means that you cannot determine the message length by looking at the hash.
A hash is a one-way function, which means that it is computationally infeasible to recover the original data from the hash.
They are "collision" resistant. There's more to this than I'll explain here, but, in a nutshell, because every hash has the same fixed length, collisions must exist in theory, so you need to be confident that it is computationally infeasible for anyone to find two different pieces of data that produce the same hash.
They can be combined with other forms of cryptography to provide other security outcomes, such as enhanced forms of data integrity like data origin authentication and non-repudiation. Digital signatures utilise hashes, for example.
Cryptocurrencies also utilise hashes within their make-up. We wouldn't have Bitcoin without them.
When used as password protection, the hash transitions from the data integrity zone of security into a form of confidentiality.
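The fixed-length property in the list above is easy to see for yourself. The following is a small sketch, again assuming SHA-256 and some made-up inputs, which hashes a one-byte input and a one-million-byte input and compares the lengths of the results.

```python
import hashlib

# Hypothetical inputs: one tiny, one large (a stand-in for a very big document).
short_input = b"a"
long_input = b"a" * 1_000_000

short_hash = hashlib.sha256(short_input).hexdigest()
long_hash = hashlib.sha256(long_input).hexdigest()

# Both hashes are 64 hexadecimal characters (256 bits), whatever the input size.
print(len(short_hash), len(long_hash))  # 64 64

# Changing the input, even slightly, gives a different hash.
near_copy_hash = hashlib.sha256(b"b").hexdigest()
print(short_hash != near_copy_hash)  # True
```

Nothing in either hash reveals which one came from the larger input.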
Now, everything I've said about hashes being one-way and collision resistant assumes a strong hashing algorithm. Unfortunately, the world of hashing has been blighted by some that are not: MD5 and SHA-1, for example, are no longer considered collision resistant. Therefore, if we apply this knowledge to the central focus of this website - the twinning of security law and operational security and my hypothesis that you need to understand both to deal with either - then you'll (hopefully) appreciate why you need to know whether a hashing algorithm is strong or weak: a weak one could undermine your security posture.
For instance, you don't want weak password hashes, do you? That can lead you into a heap of operational and legal trouble. Likewise, you don't want passwords to be stored "in the clear", which means without being hashed. A system controller should never have access to passwords, so if you ever read (say, in a breach notification letter) that your password has been compromised, it's highly likely that it was being stored in the clear or protected only by a weak hash, constituting a significant failure of operational security and a breach of any corresponding legal duties.
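As a rough illustration of what storing a password hash instead of the password might look like, here is a minimal sketch using PBKDF2 from Python's standard library. The iteration count, salt size and function names are my assumptions for the example, not a recommendation for any specific system; a real deployment should follow current guidance and may well prefer a dedicated password hashing scheme.

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration only; tune it to your own hardware and policy.
ITERATIONS = 200_000


def hash_password(password: str):
    """Return (salt, derived_hash) so that the password itself is never stored."""
    salt = os.urandom(16)  # a fresh random salt for every password
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the hash from the attempt and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)


# Hypothetical usage: only the salt and the derived hash would be kept on file.
salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))                   # False
```

The salt means that two users with the same password end up with different stored hashes, and the deliberately slow derivation makes large-scale guessing more expensive for an attacker who steals the file.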
And if a weak hash is used for data origin authentication or non-repudiation, your trust in others could be wholly undermined. I'll examine these issues in another blog soon.
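In the meantime, here is a minimal sketch of one way a hash can be combined with a secret key to support data origin authentication: an HMAC built on SHA-256. The key, the message and the helper names are hypothetical and chosen only for illustration.

```python
import hashlib
import hmac

# Hypothetical secret shared in advance between sender and receiver.
shared_key = b"hypothetical-shared-secret"
message = b"Transfer approved"


def make_tag(key: bytes, msg: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the message using the shared key."""
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def verify_tag(key: bytes, msg: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, msg), tag)


tag = make_tag(shared_key, message)
print(verify_tag(shared_key, message, tag))               # True
print(verify_tag(shared_key, b"Transfer rejected", tag))  # False
print(verify_tag(b"some-other-key", message, tag))        # False
```

Because both parties hold the same key, a valid tag shows that the message came from someone who knows the key, but it cannot prove which of them created it. Non-repudiation needs a digital signature rather than a shared-key construction like this one.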