Summary:

At the core of the TLSNotary protocol is dividing the TLS session keys between two parties (Client and Notary) and then using secure two-party computation (2PC) to generate an encrypted and authenticated request which Client sends to a TLS-enabled webserver.

During the run of the protocol neither Client nor Notary know none of the full TLS session keys, they only know their shares of those keys. This preserves the security assumptions of TLS while at the same time allowing Client to prove to Notary the provenance of the TLS transcript.

Below is an in-detail description of each step of the TLSNotary protocol:

Table of Contents

  1. Computing shares of the pre-master secret
  2. Using garbled circuit to derive shares of TLS session keys from shares of PMS
  3. Encrypting the request
  4. Computing MAC of the request using Oblivious Transfer
  5. Receiving the webserver response
  6. Contents of the notarization document
  7. Verifying the notarization document

1. Computing shares of the pre-master secret.

In TLS, the first step towards obtaining TLS session keys is to compute a shared secret between the client and the server by running the ECDH protocol (https://en.wikipedia.org/wiki/Elliptic-curve_Diffie–Hellman). The resulting shared secret in TLS terms is called the pre-master secret (PMS).

Using the notation from Wikipedia, below is the 3-party ECDH protocol between the server (S) the client (C) and the notary (N), enabling the client and the notary to arrive at shares of PMS.

  1. S sends its public key QbQ_b to C and C passes it to N

  2. C picks a random private key share dcd_c and computes a public key share Qc=dcGQ_c = d_c * G

  3. N picks a random private key share dnd_n and computes a public key share Qn=dnGQ_n = d_n * G

  4. N sends QnQ_n to C who computes Qa=Qc+QnQ_a = Q_c + Q_n and sends QaQ_a to S

  5. C computes an EC point (xp,yp)=dcQb(x_p, y_p) = d_c * Q_b

  6. N computes an EC point (xq,yq)=dnQb(x_q, y_q) = d_n * Q_b

  7. Addition of points (xp,yp)(x_p, y_p) and (xq,yq)(x_q, y_q) results in the coordinate xrx_r, which is PMS. (The coordinate yry_r is not used in TLS)

(Using the notation from https://en.wikipedia.org/wiki/Elliptic_curve_point_multiplication#Point_addition)

Our goal is to compute xr=(yqypxqxp)2xpxqx_r = (\frac{y_q-y_p}{x_q-x_p})^2 - x_p - x_q in such a way that a) neither party learns the other party's xx value, and b) neither party learns xrx_r. The parties must only learn their shares of xrx_r.

Let's start out by simplifying the equation

xr=(yq22yqyp+yp2)(xqxp)2xpxqmodpx_r = (y_q^2 - 2y_qy_p + y_p^2)(x_q-x_p)^{-2} - x_p - x_q \mod p

Since this is finite field arithmetic, if the result xrx_r is larger than pp, we must "reduce xrx_r modulo pp", i.e assign to xrx_r the value xrmodpx_r \mod p. The trailing modp\mod p is always implied from here on out but may be omitted for brevity. (If you're curious, pp of the most common EC curve P-256 is a prime number and its value is 22562224+2192+29612^{256} - 2^{224} + 2^{192} + 2^{96} - 1)

Based on Fermat's little theorem (https://en.wikipedia.org/wiki/Fermat's_little_theorem), a2modp=ap3modpa^{-2} \mod p = a^{p-3} \mod p. Replacing the negative power, we get:

xr=(yq22yqyp+yp2)(xqxp)p3xpxqx_r = (y_q^2 - 2y_qy_p + y_p^2)(x_q-x_p)^{\color{red} \bf p-3} - x_p - x_q

We will compute xr=AB+Cx_r = A*B + C using the Paillier cryptosystem (https://en.wikipedia.org/wiki/Paillier_cryptosystem) which has some neat homomorphic properties which allow us to:

In the beginning, Notary generates a Paillier keypair and sends the public key to Client. E()E() denotes an encrypted value. The parties proceed to compute:

1.1. Computing A=(yq22yqyp+yp2)A = (y_q^2 - 2y_qy_p + y_p^2)

Notary:

Client:

(Note that here NAN_A (as well as NbN_b and NBN_B below) is crucial, as without it Notary would be able to factorize AMAA*M_A and learn A.)

Notary:

1.2. Computing B=(xqxp)p3B =(x_q-x_p)^{p-3}

Notary:

Client:

Notary:

Client:

Notary:

1.3. Computing A*B+C

Notary

Client:

Notary:

The protocol described above is secure against Notary sending malicious inputs. Indeed, because Client only sends back masked values, Notary cannot learn anything about those values.

2. Using garbled circuit to derive shares of TLS session keys from shares of PMS.

Using Garbled Circuit (GC) (https://en.wikipedia.org/wiki/Garbled_circuit), each party provides their PMS share as an input to the circuit. Notary acts as the garbler. Client acts as the evaluator. The circuit combines the PMS shares, computes TLS's PRF function and outputs XOR shares of TLS session keys to each party. Also the circuit builds the Client Finished message and checks the Server Finished message.

In order to prevent malicious garbling usually the protocol called Authenticated Garbling (AG2PC) (https://eprint.iacr.org/2017/030) would be used. However, AG2PC has a high bandwidth requirement: ~500 bytes need to sent for each AND gate of the circuit. For a standard 2KB request, using AG2PC would mean Client has to download 320MB of data from Notary.

We describe a technique which allows us to reduce the bandwidth by a factor of 5 compared to AG2PC. The essense of it is that both Notary and Client take turns acting as a garbler and as an evaluator of the same circuit:

  1. N acts as the garbler and C acts as the evaluator of a circuit. C evaluates the circuit, gets output OUTcOUT_c.
  2. C acts as the garbler and N acts as the evaluator of the same circuit. N evaluates the circuit, gets output OUTnOUT_n.
  3. C sends to N a hash commitment H(OUTc+salt)H(OUT_c + salt).
  4. N sends to C OUTnOUT_n.
  5. C check that OUTn=OUTcOUT_n = OUT_c, otherwise aborts because N was cheating.
  6. C reveals the committed values by sending to N OUTc+saltOUT_c + salt. N check the commitment and that OUTn==OUTcOUT_n == OUT_c, otherwise aborts because C was cheating.

The protocol above assumes that neither OUTnOUT_n nor OUTcOUT_c leak the other party's secrets. To ensure no leakage, the circuit is constructed so that it doesn't output plaintext values but instead it outputs XOR-masked values. Each party provides their XOR masks as an input to the circuit.

The protocol above also assumes that each party will provide the exact same inputs for two circuits: one time when acting as the garbler and the second time when acting as the evaluator. Suppose, Notary garbles a circuit maliciously. Then, when Notary acts as the evaluator, he has to provide such inputs to the circuit (honestly garbled by Client) that the output of the honest circuit matches the output of the malicious circuit. Since each of the circuits' output comes after performing either sha256 or AES on the shares of the parties' inputs, the likelihood of Notary succeeding equals to Notary's guessing what the Client's input to the circuit is.

3. Encrypting the client request.

With each party having their shares of client_write_key and client_write_IV as inputs to the garbled circuit, we use the same technique as in 2. to compute the AES-ECB-encrypted counter blocks. The circuit outputs this value only to Client who xors the request's plaintext with the encrypted counter blocks to get the ciphertext. Client passes the ciphertext to Notary in order to jointly compute a MAC (see below 4.). After MAC is computed, Client sends the encrypted request with MAC to the webserver.

4. Computing MAC of the request using Oblivious Transfer.

In AES-GCM, MACs are computed using the GHASH function described in the NIST publication https://nvlpubs.nist.gov/nistpubs/legacy/sp/nistspecialpublication800-38d.pdf in section 6.4: "In effect, the GHASH function calculates X1HmX2Hm1...Xm1H2XmHX_1•H^m ⊕ X_2 •H^{m-1} ⊕ ... ⊕ X_{m-1} •H^2 ⊕ X_m •H." Here XnX_n is the n-th 16-byte ciphertext block and HnH^n are powers of HH (AES-ECB-encrypted zero), is XOR, and is block multiplication as defined in section 6.3.

Our goal is to compute block multiplication of each block XnX_n with the corresponding power HmnH^{m-n}. The pseudocode of xyx•y (as per section 6.3) is:

1. let res = 0;
2. let R = 0xE1000000000000000000000000000000;
3. for (let i=127; i >= 0; i--)
4.    res = res ^ (x * ((y >> i) & 1));
5.    x = (x >> 1) ^ ((x & 1) * R);
6. return res;

Suppose, Client's value is yy and Notary's value is xx. As can be seen in line 4, depending on whether a bit of y is 1 or 0, a modified x value is either xored to the result or not. The modified x values in line 5 can be pre-computed by Notary in advance: a total of 128 values x0,x1,...,x127x_0, x_1, ..., x_{127}

Notary picks 128 distinct xor masks and appllies them to each value xnx_n. Using the 1-out-of-2 Oblivious Transfer (OT)(https://en.wikipedia.org/wiki/Oblivious_transfer#1–2_oblivious_transfer), for every bit of y, Client asks Notary to obliviously send to him either a masked xnx_n if the bit is 1 or the mask itself if the bit is 0. In the end, Notary's sum of all xor masks and Client's sum of all values received will be their shares of the product.

This technique is sufficient to compute shares of any power of HH, e.g. let's see how the parties compute H3H^3, where HnH_n and HcH_c are Notary's and Client's shares of H.

H3=(Hc+Hn)3=Hc3+3Hc2Hn+3Hn2Hc+Hn3H^3 = (H_c + H_n)^3 = H_c^3 + 3H_c^2H_n + 3H_n^2H_c + H_n^3

Since addition is defined as xor, then 3Hc2Hn=Hc2HnHc2HnHc2Hn=Hc2Hn3H_c^2H_n = H_c^2H_n ⊕ H_c^2H_n ⊕ H_c^2H_n = H_c^2H_n, thus

H3=Hc3+Hc2Hn+Hn2Hc+Hn3H^3 = H_c^3 + H_c^2H_n + H_n^2H_c + H_n^3

The parties will compute locally Hc3H_c^3 and Hn3H_n^3 respectively. Then, using OT they will compute shares of Hc2HnH_c^2H_n and Hn2HcH_n^2H_c. H3H^3 is the xor-sum of locally computed values and shares computed with OT.

Additional optimizations:

A) Free squaring.

Suppose the parties start with shares of HxH^x, then the shares Hx2,Hx3,Hx4,...H^{x^2}, H^{x^3}, H^{x^4}, ... can be computed locally without OT. To demonstrate, if parties start with shares of H2H^2 and want to compute shares of H4H^4, it can be seen that the middle terms are all equal to 0, because xoring a value to itself an even amount of times results in 0:

H4=(Hc2+Hn2)2=Hc4+4(Hc3Hn)+6(Hc2Hn2)+4(Hn3Hc)+Hn4=Hc4+Hn4 H^4 = (H^2_c+H^2_n)^2 = H_c^4 + 4(H^3_cH_n) + 6(H^2_cH^2_n) + 4(H^3_nH_c) + H_n^4 = H_c^4 + H_n^4

B) Block aggregation.

With parties having HnH_n and HcH_c and having computed shares locally using free squaring, they proceed to compute, e.g. H3X3+H5X5H^3X_3 + H^5X_5, where X3X_3 and X5X_5 are ciphertext blocks corresponding to the powers of HH for the GHASH function. (Hn2H^2_n means Notary's share of H2H^2)

Expanding:

H3X3=(Hn2+Hc2)(Hn+Hc)X3=Hn2HnX3+Hn2HcX3+Hc2HnX3+Hc2HcX3H^3X_3 = (H^2_n+H^2_c)(H_n+H_c)X_3 = H^2_nH_nX_3 + H^2_nH_cX_3 + H^2_cH_nX_3 + H^2_cH_cX_3 and H5X5=(Hn4+Hc4)(Hn+Hc)X5=Hn4HnX5+Hn4HcX5+Hc4HnX5+Hc4HcX5H^5X_5 = (H^4_n+H^4_c)(H_n+H_c)X_5 = H^4_nH_nX_5 + H^4_nH_cX_5 + H^4_cH_nX_5 + H^4_cH_cX_5

Let K1n,K2n,...K1_n, K2_n, ... be values which Notary can compute locally

Let K1c,K2c,...K1_c, K2_c, ... be values which Client can compute locally

Given that all values of XnX_n are known to both parties, we rewrite the expanded sum of H3X3+H5X5H^3X_3 + H^5X_5:

K1n+K2nHc+K2cHn+K1c+K3n+K4nHc+K4cHn+K3cK1_n + K2_nH_c + K2_cH_n + K1_c + K3_n + K4_nH_c + K4_cH_n + K3_c

The parties only need to do two block multiplication using OT (the rest can be computed locally):

K2nHc+K2cHn+K4nHc+K4cHn=(K2n+K4n)Hc+(K2c+K4c)HnK2_nH_c + K2_cH_n + K4_nH_c + K4_cH_n = (K2_n+K4_n)H_c + (K2_c+K4_c)H_n

The parties could have aggregated more X values, e.g. H3X3+H5X5+H9X9+H17X17H^3X_3 + H^5X_5 + H^9X_9 + H^{17}X_{17} would still require only 2 block multiplication using OT.

The best strategy (which involves the least amount of OT) is for the parties to first compute shares of powers 3,5,7,9,11 ..., then apply "free squaring" to those shares and then run block aggregation.

5. Receiving the server response.

When Client receives a response from the webserver, he sends to Notary a hash commitment to the response. Notary reveals to Client Notary's TLS session keys' shares. Client combines his key shares with Notary's key shares and gets full TLS session keys. Client proceeds to authenticate the response and to decrypt it.

6. Contents of the notarization document.

The notarization document includes the following elements signed by Notary:

  1. Client's hash commitment to the server response
  2. Client's hash commitment to the Client's shares of TLS session keys
  3. Client's hash commitment to the Client's share of the PMS
  4. GHASH inputs used when computing MAC for the request
  5. Webserver's ephemeral pubkey used when computing shares of PMS
  6. Notary's PMS share
  7. Notary's TLS session keys
  8. Notarization timestamp

In addition to the elements signed by Notary, Client also includes in the notarization document the following:

  1. Notary's signature over the above 8 elements
  2. x509 certificate chain from the webserver's "Certificate" TLS message
  3. client_random and server_random values of the TLS session
  4. Webserver's signature over ephemeral pubkey and random values from the "Server Key Exchange" TLS message
  5. Client's TLS session keys
  6. Client's share of the PMS
  7. Webserver response with MAC

7. Verifying the notarization document.

Any third party who trusts the Notary's pubkey and has a list of trusted root CA certificates can verify the notarization document by performing the following: