Cabinet of curiosities: A bunch of cryptographic protocol oddities
In this short trivia article I will list a few unusual cryptographic design choices in some real applications/protocols. Specifically, I will focus on two types of “oddities”.
1) Unusual or weird design choices, maybe bordering voodoo/superstitions territory.
2) Choices that may have been very reasonable historically, but will be perceived as unusual for a reader in 2016 who has the benefit of hindsight.
I will not list design choices that have simply been broken, instead focusing on oddities that are maybe more interesting. Hopefully it will make you say WTF at least once. :-)
We have learned a lot over the years, and protocols designed by experts in the year 2016 are typically fairly sane, at least lacking in surprises. That was not always the case, however, which led to choices made with a poorer understanding of cryptography engineering than what we have today.
The point of this article is not to point fingers or make fun of designs. It is simply to highlight design choices that may have made sense with the knowledge we had of crypto design at the time when the protocol was designed. Some of the choices may still be controversial.
Oddity: Unexplained “strengthenings”
Sometimes you come across a design that uses some unusual construct to make the design “stronger”, without explaining why the construct is needed or how it works. This kind of cargo culting is especially common in non-cryptographers' hobby designs, but sometimes it gets deployed broadly. One interesting example of this is in MS-CHAP v2.
MS-CHAP v2 is Microsoft's version of the CHAP protocol, used among others as an authentication option for PPTP. It implements a challenge-response mechanism that at a very high level works like this: the server sends the client a challenge message. The client hashes the combination of the challenge and a secret. The server verifies the result by doing the same (the secret is shared).
The protocol is described in RFC 2759 (so it’s quite old), and the main curiosity in this protocol is revealed in the function GenerateAuthenticationResponse. After (roughly) hashing the password/secret and the challenge, it ends by hashing in a magic 41-byte string called Magic2. Roughly, and ignoring some detail, it does the following:
Digest = SHA1(password || challenge || Magic2)
What, you may ask, is Magic2? It’s the following string:
"Pad to make it do more than one iteration"
The reason for hashing in this string is not explained, but one can only guess that the protocol designer wanted to make the SHA function do more work, or be “more random”, by lengthening the input enough to exceed 512 bits, the block size of the SHA-1 algorithm. An input longer than the block size forces SHA-1 to run its compression function one more time, which is exactly what the string itself says.
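If you want to check that claim, here is a minimal Python sketch. The 16-byte "password_digest" and 8-byte "challenge" below are hypothetical stand-ins, not the exact values RFC 2759 feeds into the hash; the point is only that appending the 41-byte Magic2 pushes the input past SHA-1's 64-byte block, so the compression function runs twice instead of once.

import hashlib

MAGIC2 = b"Pad to make it do more than one iteration"

def sha1_compression_calls(msg: bytes) -> int:
    # SHA-1 padding appends a 0x80 byte and an 8-byte length field, then rounds
    # the total up to a multiple of the 64-byte block size.
    return (len(msg) + 1 + 8 + 63) // 64

password_digest = b"\x00" * 16   # hypothetical stand-in
challenge = b"\x01" * 8          # hypothetical stand-in

short_input = password_digest + challenge
long_input = short_input + MAGIC2

print(len(MAGIC2))                          # 41
print(sha1_compression_calls(short_input))  # 1  (24 bytes fits in one block)
print(sha1_compression_calls(long_input))   # 2  (65 bytes needs a second block)
print(hashlib.sha1(long_input).hexdigest())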
Oddity: Hedging against future breaks (in hindsight)
Being conservative is often good when designing cryptographic protocols. There are many cases where protocols and applications decide to use two different primitives from the same class, to still offer some security in case one of them is broken. That’s often a good idea, but it may lead to surprises down the road for people who have the benefit of hindsight. One such surprise can be found in TLS version 1.1 and older, for people who read the spec today (or in recent years).
When setting up a TLS connection, the client and the server need to construct keying material that is used as keys for various cryptographic primitives. A pseudorandom function (PRF) is used to derive new pseudorandom strings from an original pseudorandom string. In TLSv1.2, this is simply done using SHA-256 in a specified way.
In TLSv1.1 and older, however, there were concerns about which hash algorithm to use. The outcome? A function that uses both SHA-1 and MD5, XORed together. (For completeness: there is another part of the TLS 1.1 protocol that instead uses an MD5 hash and a SHA-1 hash concatenated together.)
The original rationale was that if one of the hash functions is broken, there’s a hedge: hopefully the other one still isn't. Is this reasonable? Most certainly. The decision to change the function to use only SHA-256 in the design of TLSv1.2 was controversial. If you are interested, see for example the discussion “[TLS] PRF in TLS 1.2” from 2006 on the TLS IETF mailing list.
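For readers curious about the structure being discussed, here is a rough Python sketch of the TLSv1.0/1.1 PRF as described in RFC 2246/4346 (an illustration, not a vetted implementation): the secret is split into two halves, one half is expanded with HMAC-MD5 and the other with HMAC-SHA1, and the two output streams are XORed together. The inputs at the bottom are made up, just to show the call shape.

import hmac

def p_hash(hash_name: str, secret: bytes, seed: bytes, length: int) -> bytes:
    # P_hash(secret, seed) = HMAC(secret, A(1) + seed) || HMAC(secret, A(2) + seed) || ...
    # where A(0) = seed and A(i) = HMAC(secret, A(i-1)).
    out = b""
    a = seed
    while len(out) < length:
        a = hmac.new(secret, a, hash_name).digest()
        out += hmac.new(secret, a + seed, hash_name).digest()
    return out[:length]

def tls10_prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    # Split the secret into two (possibly overlapping) halves, expand each half
    # with a different hash function, and XOR the results together.
    half = (len(secret) + 1) // 2
    s1, s2 = secret[:half], secret[len(secret) - half:]
    md5_stream = p_hash("md5", s1, label + seed, length)
    sha1_stream = p_hash("sha1", s2, label + seed, length)
    return bytes(x ^ y for x, y in zip(md5_stream, sha1_stream))

print(tls10_prf(b"\x0b" * 48, b"key expansion", b"\x42" * 64, 72).hex())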
Oddity: Protocols designed in the 90s have weaknesses due to iterative improvements on a legacy base
Highlighting problems with OpenPGP is maybe beating a dead horse, but it illustrates well the generic problem of bolting security onto a 90s-era protocol.
The current version of the OpenPGP specification, as implemented in for example GPG, is mostly backwards compatible with older versions of the specification, which were designed in the olden days. This leads to a few well-known problems with OpenPGP. I will list two of them.
#1. If you sign, but do not encrypt, a message, the signature does not cover some of the metadata in the OpenPGP message. For example, there’s a metadata field that contains the proposed file name of the signed data. This field can be tampered with and the signature check will still hold:
$ echo "hello world" > helloworld.txt
$ gpg --sign helloworld.txt
$ sed 's/helloworld.txt/tamperedxx.txt/' helloworld.txt.gpg > helloworld.txt.gpg.tampered
$ gpg --verify helloworld.txt.gpg.tampered
... gpg: Good signature from …
#2. In the original Pretty Good Privacy specification there was no support for integrity-protecting messages. Nowadays it’s standard practice not to touch (decrypt) a message before the MAC has been verified. That was not the case in the 90s. When the OpenPGP RFC 4880 spec was written, integrity checking was “bolted onto” the old design. It works like this:
You hash the plaintext with SHA-1. Then you append the SHA-1 hash to the end of the plaintext. Then you encrypt the combination of the plaintext and the hash. So it’s “MAC”-then-Encrypt.
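Here is a deliberately simplified Python sketch of that ordering. A toy XOR keystream stands in for OpenPGP's CFB encryption, and the real RFC 4880 MDC also covers a random prefix and packet headers, which this skips. The point is the ordering: the receiver has to decrypt before it can check anything.

import hashlib

def toy_keystream(key: bytes, length: int) -> bytes:
    # Toy SHA-1-in-counter-mode keystream; NOT OpenPGP's CFB mode, just a stand-in.
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha1(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def seal(key: bytes, plaintext: bytes) -> bytes:
    # "MAC"-then-encrypt: append SHA-1 of the plaintext, then encrypt the whole thing.
    body = plaintext + hashlib.sha1(plaintext).digest()
    return xor(body, toy_keystream(key, len(body)))

def unseal(key: bytes, ciphertext: bytes) -> bytes:
    # The receiver must decrypt first, and only then can it verify the hash.
    body = xor(ciphertext, toy_keystream(key, len(ciphertext)))
    plaintext, mdc = body[:-20], body[-20:]
    if hashlib.sha1(plaintext).digest() != mdc:
        raise ValueError("modification detected")
    return plaintext

print(unseal(b"key", seal(b"key", b"attack at dawn")))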
The RFC is pretty apologetic about this design choice. I quote RFC 4880 section 5.13 (here “MDC” means Modification Detection Code):
It is a limitation of CFB encryption that damage to the ciphertext
will corrupt the affected cipher blocks and the block following.
Additionally, if data is removed from the end of a CFB-encrypted
block, that removal is undetectable. (Note also that CBC mode has
a similar limitation, but data removed from the front of the block
is undetectable.)
The obvious way to protect or authenticate an encrypted block is
to digitally sign it. However, many people do not wish to
habitually sign data, for a large number of reasons beyond the
scope of this document. Suffice it to say that many people
consider properties such as deniability to be as valuable as
integrity.
OpenPGP addresses this desire to have more security than raw
encryption and yet preserve deniability with the MDC system. An
MDC is intentionally not a MAC. Its name was not selected by
accident. It is analogous to a checksum.
Despite the fact that it is a relatively modest system, it has
proved itself in the real world. It is an effective defense to
several attacks that have surfaced since it has been created. It
has met its modest goals admirably.
Consequently, because it is a modest security system, it has
modest requirements on the hash function(s) it employs.
Oddity: Too many knobs lead to difficulty reasoning about security
This entry is maybe less exciting, and more problematic, than the other ones, but it is still worth discussing. A design feature of IPSEC is that it allows a lot of flexibility. Maybe a bit too much flexibility. With great power comes great responsibility, and it’s quite easy to configure IPSEC in a potentially problematic way.
To recap, IPSEC supports a few different protocols, each with its own cryptographic goal. One of the protocols is ESP, the Encapsulating Security Payload. It can provide confidentiality and (optional) data origin authentication for the encapsulated packet. Another protocol is AH, the Authentication Header. It provides only data origin authentication for the payload.
The flexibility gotchas of IPSEC are many. The administrator can configure a deployment to use either AH or ESP for data origin authentication. You could also skip authentication altogether, since ESP supports leaving it out. To further complicate things, AH and ESP can be applied in either order: first AH, then ESP; or first ESP, then AH. And to complicate things even further, IPSEC thereby allows for both encrypt-then-MAC and MAC-then-encrypt configurations.
And this does not even touch on the various cryptographic primitives each of the protocols supports, which give even more choices during setup.
IPSEC is, arguably, an instance of the problem of giving too many knobs to the user, with very few guidelines on how not to shoot yourself in the foot. Fortunately, as of 2016 there is plenty of advice on how to configure IPSEC from vendor implementations and other documentation (see for example RFC 7321 from 2014), but it can be argued that it’s still problematic that the protocol offers so many options. Newer protocols typically go the other way: allow as little room for screwing up as possible.
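To make that last distinction concrete, here is a generic Python sketch of the two orderings. A toy XOR "cipher" and HMAC-SHA256 stand in for whatever primitives a given IPSEC configuration would actually negotiate; this is not real AH/ESP packet processing, only an illustration of what the MAC covers in each case.

import hashlib, hmac

def toy_encrypt(key: bytes, data: bytes) -> bytes:
    # Toy counter-mode XOR "cipher", illustration only.
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(x ^ y for x, y in zip(data, stream))

def encrypt_then_mac(enc_key: bytes, mac_key: bytes, payload: bytes) -> bytes:
    ct = toy_encrypt(enc_key, payload)
    tag = hmac.new(mac_key, ct, "sha256").digest()   # tag covers the ciphertext
    return ct + tag

def mac_then_encrypt(enc_key: bytes, mac_key: bytes, payload: bytes) -> bytes:
    tag = hmac.new(mac_key, payload, "sha256").digest()  # tag covers the plaintext...
    return toy_encrypt(enc_key, payload + tag)           # ...and is then encrypted too

print(encrypt_then_mac(b"k-enc", b"k-mac", b"packet payload").hex())
print(mac_then_encrypt(b"k-enc", b"k-mac", b"packet payload").hex())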
(Minor) Oddity: Status quo vs personal choice
Daniel J Bernstein is arguably one of the most influential cryptographers (and cryptographic engineers) working for the internet today. He also seems to be the only one with a soft spot for using little-endian representations of integers in his cryptographic designs, in a world that is largely dominated by big-endian protocols.
Is this a security issue? Of course not. And it may or may not lead to some performance improvements in some cases. It does, however, lead to less unified designs, and maybe some bikeshedding.
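As a trivial illustration of what the difference amounts to on the wire, here is the same 32-bit integer serialized both ways in Python (big-endian being traditional "network byte order", while for example the ChaCha20 and Poly1305 specifications read their words little-endian):

import struct

n = 0x01020304
print(struct.pack(">I", n).hex())  # big-endian ("network byte order"): 01020304
print(struct.pack("<I", n).hex())  # little-endian: 04030201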