Saturday, July 28, 2012

Fun with the TLS handshake

This article is about the TLS handshake in general, with a focus on how cryptographic signing is used in the protocol. Specifically, the article will attempt to do the following:
  1. Explain how the TLS handshake works, using Python code.
  2. Highlight how some parts of the protocol allow a short amount of client-supplied data to be signed with the key of the server certificate.
  3. ... and abuse those parts of the protocol, and common implementations of it, to build poor man's trusted timestamping.
The companion code to this article is the little project "toytls", hosted at GitHub, which implements the client part of the handshake and some surrounding utilities to accomplish points 2 and 3 above.

See the toytls page at GitHub for the code.

An overview of the TLS handshake

TLS, and its predecessor SSL, is a protocol for secure (encrypted, authenticated) communication over an untrusted network. The "TLS handshake" is the initial part of the protocol, where a client and a server negotiate how (if) to accomplish the goal of secure communication.

The handshake consists of multiple messages sent back and forth between the server and the client, with the ultimate goals of a) allowing the client to verify the authenticity of the server, usually using cryptographic certificates, b) allowing the server to verify the client, and c) securely agreeing on a method for the bulk transfer of data.

The handshake typically looks like this:

>>> ClientHello
<<< ServerHello
<<< Certificates
<<< ServerHelloDone
>>> ClientKeyExchange
>>> ChangeCipherSpec
>>> Finished
<<< ChangeCipherSpec
<<< Finished

This means that the client first sends a message ClientHello to the server, and will get the response ServerHello back, etc. The handshake ends with both client and server sending Finished. After that point, a secure channel has been set up for the transfer of data.

Cryptographic primitives and cipher suites

TLS has the concept of a "cipher suite". This is a predefined combination of cryptographic primitives of different classes (encryption, hashing, authentication etc.) to use in the handshake and the subsequent communication. In this article, we will concern ourselves with three cipher suites:

TLS_RSA_WITH_RC4_128_SHA = 0x0005
TLS_DHE_RSA_WITH_AES_128_CBC_SHA = 0x0033
TLS_ECDHE_RSA_WITH_RC4_128_SHA = 0xc011

The first one, suite TLS_RSA_WITH_RC4_128_SHA (which has id 5), is implemented and supported by almost all clients and servers. The last two are not as common (the servers at Google and Facebook support one each, but not both) and are selected specifically because they have interesting signing behavior, which will be discussed further below.

In English, the three cipher suites above mean:
  • A cipher suite using RSA for secure key exchange, RC4 for bulk data encryption and SHA1 as a hash.
  • A cipher suite using Diffie-Hellman for secure key exchange, RSA signatures for authentication, AES-128 in CBC mode for bulk encryption and SHA1 as hash.
  • A cipher suite using Elliptic Curve Diffie-Hellman for secure key exchange, RSA signatures for authentication, RC4 for bulk data encryption and SHA1 as hash.
As can be seen, a full implementation of one of the versions of TLS requires a full arsenal of cryptographic primitives (the three examples given are just a small subset of the cipher suites specified in the RFCs). For the remainder of this article, we will mostly concern ourselves with the first suite, and only turn to the other two to implement the "clever hacks" mentioned at the very beginning of this article.
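
On the wire, a cipher suite is simply its two-byte id in big-endian order. As a small illustration (a Python 3 sketch, not code taken from toytls; the names CIPHER_SUITES and encode_cipher_suites are made up for this example), the three suites can be put in a table and encoded as the length-prefixed list a client uses to offer them:

```python
import struct

# Cipher suite ids as listed above.
CIPHER_SUITES = {
    "TLS_RSA_WITH_RC4_128_SHA": 0x0005,
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA": 0x0033,
    "TLS_ECDHE_RSA_WITH_RC4_128_SHA": 0xC011,
}


def encode_cipher_suites(names):
    """Return a two-byte length followed by one two-byte id per suite."""
    ids = b"".join(struct.pack(">H", CIPHER_SUITES[name]) for name in names)
    return struct.pack(">H", len(ids)) + ids


encoded = encode_cipher_suites([
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_RC4_128_SHA",
])
print(encoded.hex())  # 00040033c011
```

The printed bytes are the same cipher suite length and cipher suite fields that appear in the sample ClientHello in the next section.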

To implement a handshake using TLS_RSA_WITH_RC4_128_SHA, you need, of course, an implementation of RSA, RC4 and SHA1. In addition, TLS version 1.1, which is by far the most common version on the Internet today, requires an implementation of MD5, which is used for authenticating the handshake itself. This is not a requirement for TLS version 1.2, which uses SHA-256 instead.

For practical use of RSA, the cipher suites mentioned employ the PKCS#1 v1.5 padding scheme, which is a method for padding a byte string securely before encrypting it.
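
To make the padding concrete, here is a minimal Python 3 sketch of the encryption variant of this padding: a zero byte, a 0x02 byte, at least eight nonzero random bytes, a zero separator, and then the message. The function name is hypothetical and this is not the code in toytls/rsa.py, which may be structured differently.

```python
import os


def pkcs1_pad_for_encryption(message, modulus_len):
    """Pad message to modulus_len bytes: 00 02 <nonzero random> 00 <message>."""
    pad_len = modulus_len - len(message) - 3
    if pad_len < 8:
        raise ValueError("message too long for this modulus size")
    padding = b""
    while len(padding) < pad_len:
        # Padding bytes must be nonzero, so draw random bytes and drop zeros.
        chunk = os.urandom(pad_len - len(padding))
        padding += bytes(b for b in chunk if b != 0)
    return b"\x00\x02" + padding + b"\x00" + message


# Example: pad a 48-byte secret for a 1024-bit (128-byte) RSA modulus.
padded = pkcs1_pad_for_encryption(b"\x01" * 48, 128)
print(len(padded))  # 128
```

The padded block is then interpreted as a big-endian integer and raised to the public exponent modulo the RSA modulus.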

In toytls, the cryptographic primitives are implemented in toytls/rsa.py and toytls/rc4.py.

Message structure

To get a general feel for the TLS handshake we will dissect a sample ClientHello message, being sent from the client to the server. A simple ClientHello can look like below, with line breaks and comments added for readability.

16 03 02 00 31 # TLS Header
01 00 00 2d # Handshake header
03 02 # ClientHello field: version number (TLS 1.1)
50 0b af bb b7 5a b8 3e f0 ab 9a e3 f3 9c 63 15 \
33 41 37 ac fd 6c 18 1a 24 60 dc 49 67 c2 fd 96 # ClientHello field: random
00 # ClientHello field: session id length (0, no session id)
00 04 # ClientHello field: cipher suite length
00 33 c0 11 # ClientHello field: cipher suite(s)
01 # ClientHello field: compression support, length
00 # ClientHello field: compression support, no compression (0)
00 00 # ClientHello field: extension length (0)

All TLS messages have the TLS header, which consists of a content type (first byte), a version number (next two bytes) and a length (last two bytes). The content type describes the type of the message, and is 0x16 for handshake messages and 0x17 for encrypted application data.

The handshake messages have a second handshake header describing the type of the handshake message (ClientHello, ServerHello etc). The handshake header also has a length field.
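
As an illustration, the two headers of the sample ClientHello above can be decoded with a few lines of Python 3. This is a sketch with made-up function names, not the parser used in toytls. Note that the length in the handshake header is three bytes wide.

```python
import struct


def parse_record_header(data):
    """Split the 5-byte TLS header into (content type, version, length)."""
    content_type, major, minor, length = struct.unpack(">BBBH", data[:5])
    return content_type, (major, minor), length


def parse_handshake_header(data):
    """Split the 4-byte handshake header into (message type, length)."""
    msg_type = data[0]
    length = int.from_bytes(data[1:4], "big")  # 24-bit big-endian length
    return msg_type, length


sample = bytes.fromhex("1603020031" "0100002d")
print(parse_record_header(sample))         # (22, (3, 2), 49)
print(parse_handshake_header(sample[5:]))  # (1, 45)
```

Content type 22 is 0x16 (handshake), version (3, 2) is TLS 1.1, and handshake message type 1 is ClientHello. The record length 49 is the handshake length 45 plus the four bytes of the handshake header.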

Following the handshake header is payload data specific to the handshake message. For ClientHello, this field contains some random data used for encryption purposes, a list of cipher suites supported by the client, information about compression support and optionally support for TLS extensions, which will not be covered in this article.
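
Putting the pieces together, the sample ClientHello above can be rebuilt byte for byte. The following is a Python 3 sketch under the assumptions of this article (no session id, no compression, no extensions); build_client_hello is a hypothetical helper, not the toytls API.

```python
import struct


def build_client_hello(random_bytes, cipher_suites, version=(3, 2)):
    """Build a complete ClientHello record for the given 32 random bytes."""
    if len(random_bytes) != 32:
        raise ValueError("the random field must be exactly 32 bytes")
    suites = b"".join(struct.pack(">H", suite) for suite in cipher_suites)
    body = (
        bytes(version)                             # version number
        + random_bytes                             # random
        + b"\x00"                                  # session id length 0
        + struct.pack(">H", len(suites)) + suites  # cipher suites
        + b"\x01\x00"                              # one compression method: none
        + b"\x00\x00"                              # extension length 0
    )
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16" + bytes(version) + struct.pack(">H", len(handshake)) + handshake


sample_random = bytes.fromhex(
    "500bafbbb75ab83ef0ab9ae3f39c6315"
    "334137acfd6c181a2460dc4967c2fd96"
)
hello = build_client_hello(sample_random, [0x0033, 0xC011])
print(hello[:9].hex())  # 16030200310100002d
```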

It is worth mentioning that, although the ChangeCipherSpec message is sent in the handshake procedure, it is not considered a handshake message and thus has a different content type (0x14) and does not have the handshake header. All the other messages mentioned above are handshake messages with a handshake header and a handshake content type.
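
For comparison with the ClientHello, a complete ChangeCipherSpec record is tiny: the TLS header followed by a single payload byte with the value 1. A Python 3 sketch (the function name is made up for this example):

```python
import struct


def change_cipher_spec_record(version=(3, 2)):
    """Return a ChangeCipherSpec record: TLS header plus the one-byte payload."""
    payload = b"\x01"
    return b"\x14" + bytes(version) + struct.pack(">H", len(payload)) + payload


print(change_cipher_spec_record().hex())  # 140302000101
```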

Error handling

If something goes wrong during the handshake, the server can send an Alert message, which has content type 0x15 (decimal 21). If you send incorrect TLS messages to the server (for example by implementing authentication incorrectly, using incorrect decryption keys, attempting to use unsupported features, or otherwise implementing the protocol wrong), you will get alert messages indicating some kind of failure.
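
The payload of an unencrypted alert is just two bytes: a level and a description. The following Python 3 sketch decodes them. The table covers only a handful of the description codes defined in the TLS RFCs, and the names used here are for illustration only.

```python
ALERT_LEVELS = {1: "warning", 2: "fatal"}

# A small subset of the alert descriptions defined for TLS.
ALERT_DESCRIPTIONS = {
    0: "close_notify",
    10: "unexpected_message",
    20: "bad_record_mac",
    40: "handshake_failure",
    47: "illegal_parameter",
    51: "decrypt_error",
    70: "protocol_version",
}


def parse_alert(payload):
    """Decode the two-byte alert payload into (level, description)."""
    level, description = payload[0], payload[1]
    return (
        ALERT_LEVELS.get(level, "unknown"),
        ALERT_DESCRIPTIONS.get(description, "unknown (%d)" % description),
    )


print(parse_alert(b"\x02\x28"))  # ('fatal', 'handshake_failure')
```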

In this article we will mostly ignore these messages. Just be aware that if you play around with toytls and change messages you will surely get these messages from the server.

The handshake from above

In this section of the article we will try to ignore the details of how the different TLS messages are marshalled, and instead focus on the content of each message. If you are interested in the bits and bytes, please have a look at the toytls source code and/or look at Wireshark output. In the description below, we will focus on the cipher suite TLS_RSA_WITH_RC4_128_SHA. Other cipher suites may have slightly different message contents.

>>> ClientHello

Client says it supports TLS_RSA_WITH_RC4_128_SHA, no compression, and no extensions. It sends some random data used for cryptographic purposes.

<<< ServerHello

Server responds with some random bytes of its own, and says that it accepts the cipher suite and compression option the client suggested. The server also gives a session ID which can be used if the client wants to reuse the TLS connection later. We will skip over the session ID and sessions in this article.

<<< Certificates

Server sends a list of certificates to the client (for HTTPS, these are the certificates shown in the web browser when the website is opened).

<<< ServerHelloDone

Server sends a message indicating that the client should continue with the handshake.

>>> ClientKeyExchange

The client generates a 48-byte secret called the "pre master secret" (two bytes of protocol version followed by 46 random bytes) and encrypts it with the public key in the server certificate. The encrypted secret is sent to the server.
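
A rough Python 3 sketch of this step follows. It assumes the secret has already been padded to the modulus size before the RSA operation, and it uses hypothetical function names rather than the ones in toytls. The tiny key in the usage lines is a textbook example, far too small for real use.

```python
import os


def make_pre_master_secret(client_version=(3, 2)):
    """Two bytes of protocol version followed by 46 random bytes."""
    return bytes(client_version) + os.urandom(46)


def rsa_public_operation(padded_block, n, e):
    """Raw RSA: interpret the padded block as an integer and compute m^e mod n."""
    modulus_len = (n.bit_length() + 7) // 8
    m = int.from_bytes(padded_block, "big")
    if m >= n:
        raise ValueError("block does not fit the modulus")
    return pow(m, e, n).to_bytes(modulus_len, "big")


secret = make_pre_master_secret()
print(len(secret))  # 48

# Toy key (n = 61 * 53, e = 17, d = 2753), only to show the arithmetic.
ciphertext = rsa_public_operation(b"\x41", 3233, 17)
```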

>>> ChangeCipherSpec

The client sends the control message ChangeCipherSpec indicating that everything from now on will be encrypted.

>>> Finished

The client sends an encrypted message that contains a hash of all the messages in the handshake sent and received. This makes it possible to detect tampering. This will be described further below.

<<< ChangeCipherSpec

The server sends the control message ChangeCipherSpec indicating that everything from now on will be encrypted.

<<< Finished

The server responds with a Finished message of its own, sending an encrypted hash of the handshake messages.

Cryptography in the handshake

As mentioned above the Finished message contains a hash of the handshake sequence. For TLS 1.1, this hash is a slightly unusual construct: it is the concatenation of an MD5 and a SHA1 hash. The hashed data is the content of the handshake messages, without the TLS header.

This means that, when the client constructs ClientHello, it will also update the MD5-SHA1-hash with the content of the message. When it receives ServerHello, it will also update the hash, so the client and server can compare hashes in the Finished messages.
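
The running hash can be sketched in Python 3 with hashlib. The class below is a made-up illustration, not the one in toytls/hash.py: every handshake message, including its handshake header but excluding the TLS header, is fed to both hash objects as it is sent or received.

```python
import hashlib


class HandshakeHash:
    """Running MD5 + SHA1 hash over all handshake messages seen so far."""

    def __init__(self):
        self._md5 = hashlib.md5()
        self._sha1 = hashlib.sha1()

    def update(self, handshake_message):
        # The message is passed in without the 5-byte TLS header.
        self._md5.update(handshake_message)
        self._sha1.update(handshake_message)

    def digest(self):
        # 16 bytes of MD5 followed by 20 bytes of SHA1.
        return self._md5.digest() + self._sha1.digest()


transcript = HandshakeHash()
transcript.update(b"client hello bytes")
transcript.update(b"server hello bytes")
print(len(transcript.digest()))  # 36
```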

In addition to the MD5-SHA1-hash, TLS 1.1 also uses a rather peculiar pseudo-random function (PRF) based on the two hashes. This pseudo-random function is used to generate keying material for the cryptographic primitives and MACs. The PRF is a combination of HMAC-MD5 and HMAC-SHA1. Please see toytls/hash.py for details.

When the cipher suite is TLS_RSA_WITH_RC4_128_SHA, the PRF is seeded with the pre-master secret, client random and server random data to generate a master secret. This master secret is then fed to the PRF, again together with the random data, to create keying material. In this case, the keying material is for the RC4 stream cipher and HMAC-SHA1.

Each side (client and server) holds an encryption/MAC context for itself and one for the other side, so the client in our case has RC4 keys for both the client-to-server and the server-to-client direction.
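
The PRF and the two derivation steps can be sketched as follows in Python 3. This follows my reading of the TLS 1.1 specification rather than the code in toytls/hash.py, and all function names are made up for the example. The secret is split in two halves, one half keys an HMAC-MD5 expansion and the other an HMAC-SHA1 expansion, and the two outputs are XORed.

```python
import hashlib
import hmac


def p_hash(hash_name, secret, seed, length):
    """Expand secret and seed into length bytes using iterated HMAC."""
    out = b""
    a = seed
    while len(out) < length:
        a = hmac.new(secret, a, hash_name).digest()
        out += hmac.new(secret, a + seed, hash_name).digest()
    return out[:length]


def prf(secret, label, seed, length):
    """TLS 1.0/1.1 PRF: P_MD5 over the first half XOR P_SHA1 over the second."""
    half = (len(secret) + 1) // 2  # the halves share a byte if the length is odd
    s1, s2 = secret[:half], secret[len(secret) - half:]
    md5_part = p_hash("md5", s1, label + seed, length)
    sha1_part = p_hash("sha1", s2, label + seed, length)
    return bytes(x ^ y for x, y in zip(md5_part, sha1_part))


def derive_master_secret(pre_master_secret, client_random, server_random):
    return prf(pre_master_secret, b"master secret",
               client_random + server_random, 48)


def derive_rc4_sha_keys(master_secret, client_random, server_random):
    """Keys for TLS_RSA_WITH_RC4_128_SHA: two 20-byte MAC keys, two 16-byte RC4 keys."""
    # Note the order of the random values: server first for key expansion.
    block = prf(master_secret, b"key expansion",
                server_random + client_random, 2 * 20 + 2 * 16)
    return {
        "client_mac": block[0:20],
        "server_mac": block[20:40],
        "client_key": block[40:56],
        "server_key": block[56:72],
    }
```

Both sides run the same derivation and therefore end up with the same four keys, one MAC key and one RC4 key for each direction.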

Demonstrating the handshake

In the toytls project, please try out the script toytls-handshake.py. This will, given a server and port, complete a TLS 1.1 handshake.

Fun with signatures

With the cipher suite TLS_RSA_WITH_RC4_128_SHA, the handshake is authenticated because the pre master secret is encrypted with the public key in the server certificate, making sure only the holder of the private key can decrypt it. From the client side, this is some proof the server is valid.

When the cipher suite sets up a secure channel with Diffie-Hellman, this kind of authentication is lacking. Plain Diffie-Hellman is anonymous, and anyone could impersonate the server. To mitigate this, the TLS cipher suites using Diffie-Hellman use RSA or some other algorithm to sign the random data and encryption parameters of the handshake. In TLS jargon, Diffie-Hellman with freshly generated, signed parameters is known as "Ephemeral Diffie-Hellman", which is what DHE stands for in the cipher suite names.

To inform the client of this signature, an extra message is introduced in the TLS handshake, known as ServerKeyExchange. This is sent between Certificate and ServerHelloDone. The signature is over the client and server random data, and the encryption parameters.
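
To make the signature concrete, here is a Python 3 sketch of how it can be checked, assuming TLS 1.1 and an RSA server key. In that setting the signed value is the MD5 digest followed by the SHA1 digest of the three concatenated fields, wrapped in PKCS#1 v1.5 signature padding (00 01 FF ... FF 00). The function names are hypothetical and toytls-verify.py may well be organised differently.

```python
import hashlib


def signed_digest(client_random, server_random, params):
    """MD5 + SHA1 over client random, server random and key exchange parameters."""
    data = client_random + server_random + params
    return hashlib.md5(data).digest() + hashlib.sha1(data).digest()


def verify_rsa_signature(signature, n, e, expected_digest):
    """Check a raw RSA signature against the expected padded digest."""
    modulus_len = (n.bit_length() + 7) // 8
    recovered = pow(int.from_bytes(signature, "big"), e, n)
    block = recovered.to_bytes(modulus_len, "big")
    pad_len = modulus_len - len(expected_digest) - 3
    if pad_len < 8:
        return False
    # Signature padding: 00 01, a run of FF bytes, 00, then the digest.
    expected = b"\x00\x01" + b"\xff" * pad_len + b"\x00" + expected_digest
    return block == expected
```

Here n and e are the modulus and public exponent taken from the server certificate. Because the client random is part of the hashed data, changing a single byte of it makes the verification fail.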

This means that a client can, using the client random field (as sent in ClientHello and used for encryption purposes), get up to 32 bytes of data signed with the key of the server certificate. This is not very useful on its own. However, the TLS specification says that the first four bytes of each random field should be a timestamp.
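
Reading the timestamp is a matter of unpacking the first four bytes as a big-endian unsigned integer. The Python 3 sketch below (with a made-up function name) uses the server random from the signing session shown later in this article:

```python
import datetime
import struct


def random_timestamp(random_bytes):
    """Return (unix time, UTC datetime) from the first four bytes of a random field."""
    (unix_time,) = struct.unpack(">I", random_bytes[:4])
    utc = datetime.datetime.fromtimestamp(unix_time, datetime.timezone.utc)
    return unix_time, utc


server_random = bytes.fromhex(
    "5013fa565da4ee2d51c9441cf3075781"
    "61d1f631a3b4784f193bdd7e9d55b598"
)
unix_time, utc = random_timestamp(server_random)
print(unix_time)                          # 1343486550
print(utc.strftime("%Y-%m-%d %H:%M:%S"))  # 2012-07-28 14:42:30
```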

Poor man's trusted timestamping

Trusted timestamping is cryptographically associating a timestamp with some data. Typically, you have a document you want to assert you wrote at a certain date. You take a hash of the document, send it to a trusted third party, and have them sign the hash together with a timestamp.

By using the client random field in the TLS handshake together with a Diffie-Hellman cipher suite, and using the fact that TLS implementations put a timestamp in the server random field, you can use Google or Facebook as a trusted third party for timestamping up to 32 bytes of data, which is enough for, say, a SHA-256 hash.
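
In code, the idea boils down to two small steps, sketched here in Python 3 with hypothetical helper names (this is not the interface of the toytls scripts): hash the document into exactly 32 bytes to use as the client random, and later split the signed bytes back into their three parts. The sample bytes are the ones from the session shown below.

```python
import hashlib


def client_random_for(document):
    """Use the SHA-256 hash of the document as the 32-byte client random."""
    return hashlib.sha256(document).digest()


def split_signed_bytes(signed):
    """Split the signed data into client random, server random and parameters."""
    return signed[:32], signed[32:64], signed[64:]


signed = bytes.fromhex(
    "08c247a658bcfe4668d853192dfe9a27c4f7bbc75ca6bc567fdc4726b1628ee8"
    "5013fa565da4ee2d51c9441cf307578161d1f631a3b4784f193bdd7e9d55b598"
    "03001741049c096ff72fe6a7a1bbc5227a7b9806ab0d12129212a3e700138070"
    "42e35fd8f60efc9cda1ecd9bbf61464a179299b43c3cf195956eedd635a7f859"
    "8091910bf3"
)
message, server_random, params = split_signed_bytes(signed)
print(message.hex())
print(len(server_random), len(params))  # 32 69
```

The first part is the user-supplied hash, the second starts with the server timestamp, and the rest is the key exchange parameters chosen by the server.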

In the toytls project there are two scripts, toytls-sign.py and toytls-verify.py.

$ cat statement
In this statement I claim that I wrote the following documents the year 2012:

cae3614264895c0201525ec7efff4ca6bb34dfc2 toytls/x509.py
f087da9fe9ad13e8064e6ad951b6aac8c3d54799 scripts/toytls-sign.py

Cheers
be@bjrn.se

$ sha256sum statement
08c247a658bcfe4668d853192dfe9a27c4f7bbc75ca6bc567fdc4726b1628ee8 statement

$ sha256sum statement | awk '{print $1}' |
python -c "import sys; sys.stdout.write(sys.stdin.read().strip().decode('hex'))" |
PYTHONPATH=. scripts/toytls-sign.py www.google.com 443 signed_statement
Signing message:
08c247a658bcfe4668d853192dfe9a27c4f7bbc75ca6bc567fdc4726b1628ee8

Bytes signed:
08c247a658bcfe4668d853192dfe9a27c4f7bbc75ca6bc567fdc4726b1628ee8
5013fa565da4ee2d51c9441cf307578161d1f631a3b4784f193bdd7e9d55b598
03001741049c096ff72fe6a7a1bbc5227a7b9806ab0d12129212a3e700138070
42e35fd8f60efc9cda1ecd9bbf61464a179299b43c3cf195956eedd635a7f859
8091910bf3

Signature:
7c7e3ef5aeea49674e3311112b503c6cfe8149810d5615392d0405939667bf15
95a8c9693cc8d4105a52c85615e7132467757939f72f01354a74882f59463e4d
d76b7eb4ec0de9b6922e2fc3e74336eb0ae619f90f53a2384a1465970a11a9d5
66afd335d3ae9cb2e8f7fd757d5cb5fad530923d29b3df195a963ef699711141

Server certificate:
308203213082028aa00302010202104f9d96d966b0992b54c2957cb4157d4d30
...
20e90a70641108c85af17d9eec69a5a5d582d7271e9e56cdd276d5792bf72543
1c69f0b8f9

$ PYTHONPATH=. scripts/toytls-verify.py signed_statement
Signature Verification SUCCESS

Bytes signed:
08c247a658bcfe4668d853192dfe9a27c4f7bbc75ca6bc567fdc4726b1628ee8
5013fa565da4ee2d51c9441cf307578161d1f631a3b4784f193bdd7e9d55b598
03001741049c096ff72fe6a7a1bbc5227a7b9806ab0d12129212a3e700138070
42e35fd8f60efc9cda1ecd9bbf61464a179299b43c3cf195956eedd635a7f859
8091910bf3

Bytes signed, user supplied messsage (hex):
08c247a658bcfe4668d853192dfe9a27c4f7bbc75ca6bc567fdc4726b1628ee8

Bytes signed, user supplied messsage (repr):
"\x08\xc2G\xa6X\xbc\xfeFh\xd8S\x19-\xfe\x9a'\xc4\xf7\xbb\xc7\\\xa6\xbcV\x7f\xdcG&\xb1b\x8e\xe8"

Bytes signed, server unix timestamp:
1343486550

Bytes signed, server UTC timestamp:
2012-07-28 14:42:30

Signature:
7c7e3ef5aeea49674e3311112b503c6cfe8149810d5615392d0405939667bf15
95a8c9693cc8d4105a52c85615e7132467757939f72f01354a74882f59463e4d
d76b7eb4ec0de9b6922e2fc3e74336eb0ae619f90f53a2384a1465970a11a9d5
66afd335d3ae9cb2e8f7fd757d5cb5fad530923d29b3df195a963ef699711141

Server certificate. For more details, do:
$ openssl asn1parse -inform DER -in signed_statement.certificate.der

308203213082028aa00302010202104f9d96d966b0992b54c2957cb4157d4d30
...
20e90a70641108c85af17d9eec69a5a5d582d7271e9e56cdd276d5792bf72543
1c69f0b8f9

Conclusion

Happy hacking. :)