An Exploration of 2P-ECDSANick Mooney January 23rd, 2020 (Last Updated: January 22nd, 2020)
2P-ECDSA is an implementation of two-party threshold ECDSA proposed by Yehuda Lindell of Bar-Ilan University. Lindell's work builds on top of a litany of other research in the field of threshold digital signatures. While I am hoping to make this article as accessible as possible, it is written for an audience who has some experience with elliptic curve cryptography.
As we come to desire new properties from our cryptographic infrastructure, we also have to contend with the reality that many deployed solutions were designed without those new properties in mind. Format-preserving encryption gives us a concrete example of this: a payments processor may want to encrypt credit card numbers at rest, but may also be required to use a legacy system that was only designed to store 16-digit numbers -- not encrypted binary blobs. Using format-preserving encryption, the payments processor can overcome this limitation by producing ciphertext that is encoded as a 16-digit number.
Threshold signature schemes such as threshold ECDSA are interesting because they are designed with a limitation in mind: the signatures produced by the multiparty forms of the algorithm must be verifiable by vanilla ECDSA. In other words, the scheme must work without requiring any changes to the validating party. The problem and solution are cryptographic in nature, but the bounds of the problem are defined by practical reality: ECDSA is already widely deployed.
In this article we explore some of the context of threshold cryptography while also looking at a particular problem and the specific cryptographic constructions used to solve the problem.
02. Threshold Cryptography
Threshold cryptography is the field of cryptography that focuses on operations that require the participation of multiple entities. An accessible example of this idea is the sharing of nuclear launch codes: if we want to ensure that no single individual can launch the missiles, but any two authorized individuals can come together and do so, we can design systems that ensure this cryptographically.
Threshold cryptography is usually defined in terms of a threshold value
t and a group size
n, where some
t+1 group members (at minimum) are required to perform some action, but up to
t members can collude without any advantage over what they individually possess.
We will focus specifically on threshold ECDSA (Elliptic Curve Digital Signature Algorithm) in this article, but many other kinds of threshold cryptography are possible. Threshold cryptography itself can be viewed as a subset of multiparty computation, which allows participants to collectively compute the output of a function over a combination of the data each individual possesses, while keeping each user's contribution to the input data private to only that user.
The (arguably) most well-known example of threshold cryptography is classical secret sharing. Shamir's Secret Sharing (SSS) is the go-to construction, and while we won't go into nuances here, the basic scheme can be described as follows:
A secret is created and must be "shared," or split into
n pieces. Any
k <= n) pieces can be used to reconstruct the secret, but any
k-1 or fewer pieces do not give an adversary any information about the secret. For example, we could use SSS to split a private key into 4 pieces (
n) and require at least 3 pieces (
k) to be present before the key can be reconstructed.
What we refer to as “pieces” above are really points on a polynomial. The secret to be shared is encoded in the polynomial, and then points along the polynomial are distributed to each party. Imagine a simple parabola (a polynomial of degree 2), such as
f(x) = 3x^2 + 2: any 3 points along the parabola can be used to recover the unique parabola equation, but 2 or fewer points could fit on an infinite number of parabolas. This is the property that allows us to require that a certain number of members participate to recover a secret in Shamir’s Secret Sharing.
In a bit more detail, to share a secret by splitting it into pieces, the holder of the secret follows these steps:
- Construct a polynomial* of degree
k-1where the coefficient of the
x^0term is the secret, and the coefficients of the
x^i ... x^k-1terms are randomly generated. This looks like
f(x) = a_0 + a_1*x^1 + ... + a_(k-1)*x^(k-1)where
a_0is the secret.
npoints from the resultant polynomial
- Distribute each point
* This polynomial must be over a finite field
F_p over the integers mod
p is prime. The notation is similar to polynomials over the real numbers, but polynomials over the reals do not provide the same properties.
To recover the secret, a quorum of
k participants come together and perform polynomial interpolation: fitting a curve to the given points. The security of this scheme is derived from the fact that
k points are needed to describe a unique polynomial of degree
k-1, whereas there are infinite polynomials of degree
k-1 that can be fit onto
Issues with Classical Secret Sharing
The polynomial interpolation described above is not a distributed process, which means that one participant ends up with the reconstructed secret value. A dishonest participant could store the reconstructed secret and use it unilaterally. In addition, the secret must first be constructed and held by a single party before being split into shares. As a result, there is some trust required: participants must act honestly and delete the secret value after its use, keeping only their shares.
These concerns can be mitigated somewhat by attempting to ensure the integrity of the participants before the secret distribution or reconstruction ceremony occurs, but newer cryptographic constructions can eliminate this class of risk.
03. Digital Signatures and ECDSA
Digital signatures are often used to attest to the provenance and integrity of a message. These properties are often important for things like signing software updates or validating the identity of a party that you are communicating with over an otherwise untrusted channel (as is the case with TLS). In the cryptocurrency example above, signatures are used to prove that participant X legitimately intended to send a transaction to participant Y.
ECDSA stands for Elliptic Curve Digital Signature Algorithm, and is a digital signature scheme based on the cryptographic properties of elliptic curves. Nick Sullivan has written a great primer on Elliptic Curve Cryptography, so I won't rehash it here except to give the necessary context for 2P-ECDSA.
In the nuclear launch codes example above, we gave a scenario that may benefit from requiring a quorum of participants to authorize an action. ECDSA was designed with single parties in mind, and a quorum approach could be implemented with the multi-signature scheme mentioned in the Appendix. However, if we would like to be able to construct signatures that require the participation of more than one party without changing the ECDSA scheme, we'll need a way to collaboratively generate a standard ECDSA signature.
In this article we are looking at a scheme that requires exactly two parties. There are other threshold ECDSA schemes that can be extended to any number of parties and any threshold value, including the Gennaro paper from 2019 that supports fast trustless setup and multiparty signatures with any
t <= n. Lindell's work builds most directly on top of MacKenzie and Reiter's Two-Party Generation of DSA Signatures, and the proof techniques in Lindell's paper have recently been improved upon by Castagnos et. al.
We will assume that two parties, Alice and Bob, have agreed on a set of curve parameters to use. This includes the elliptic curve equation and finite field, a generator point
G, and the integer order
G. Note that failing to keep these parameters consistent between parties can have disastrous consequences in the form of invalid curve attacks.
Under classic (i.e. non-threshold) ECDSA, Alice wishes to sign a message so that Bob can verify it. Alice picks an integer in the field (i.e. integers mod
d, which is her private key. She keeps this value secret. She then multiplies her (scalar) private key
d with the generator point, to get
d*G = Q, where
Q is Alice's public key.
The steps to generate a signature are as follows, with a couple nuances left out:
zbe an integer representing the message Alice wishes to sign, usually in the form of a truncated hash
- Select a random
kfrom the field of integers mod
- Calculate the point
(x, y) = k * G
r = x mod n
s = k^-1 (z + rd) mod n(remember that
dis Alice's private key)
(r, s) is the ECDSA signature.
Where Things Get Tricky
In the above description of ECDSA, it is important that both
k (the random nonce) and
d (the private key) remain secret. Classical secret sharing of these values is possible, but runs into the same issues described previously: one party will always end up with the full private key, meaning that the private key could be reused unilaterally by an untrustworthy participant. We want to replace Alice with two cooperating participants.
What we need is a way to calculate
(r, s) without exposing
d to either party. There are certain signature schemes such as Schnorr constructions which lend themselves easily to threshold schemes, as the operations on secret values are all representable as linear equations. ECDSA, however, requires the calculation of
k^-1. Inversion of
k is tricky to do in a distributed fashion.
04. Lindell's 2P-ECDSA
Paillier Homomorphic Encryption
2P-ECDSA makes use of Paillier encryption for certain parts of the signing operation. Paillier encryption is additively homomorphic. Additive homomorphism means that the following is possible:
Enc(m)produced by party 1, party 2 can generate
xis a scalar) without knowing
Enc(m_2)produced by party 1, party 2 can generate
Enc(m_1 + m_2)without knowing
In short, given only ciphertext, it is possible to produce new ciphertext corresponding to the multiplication of the plaintext by a scalar, or the addition of two plaintexts, without actually knowing the plaintext. The party who generated the original ciphertext can then decrypt the new ciphertext.
Paillier encryption is an asymmetric scheme, meaning that
P_2 can also generate ciphertexts of arbitrary data that can only be decrypted by
P_1 (given the Paillier public key of
P_1). In other words, given message
m and the public key of
P_2 can generate
Enc(m), but given
P_2 can only perform the above multiplication and addition operations.
We have two parties,
P_2, who would like to generate a signature together. The parties respectively generate and keep secret
d_2, their shares of the private key (such that
d = d_1 * d_2), as well as
k_2, their shares of the nonce (such that
k = k_1 * k_2). Both parties have the same curve parameters and the same message representation
z, as these are not secret.
First, the parties must generate
(x, y) = k * G so that they can generate
r = x mod n. This works just like Elliptic Curve Diffie Hellman:
P_1 can share
k_1 * G without disclosing
P_2 can multiply the result with
k_2 to get
(x, y) = k_1 * k_2 * G = k * G. The same happens in the other direction:
k_2 * G and also calculates
(x, y) = k_2 * k_1 * G = k * G. So far, so good: both parties can calculate
r = x mod n, and neither has disclosed their share of the nonce
Now, Paillier encryption comes in handy.
ckey = Enc(d_1).
ckey can be shared with
P_2 can perform certain operations on
ckey due to the homomorphic properties of Paillier encryption, but
P_2 still does not know
P_2 also has the Paillier public key of
P_1, so can compute ciphertexts that it can then combine additively with the ciphertexts received from
This allows for the following flow:
ckey = Enc(d_1)with
k_2^-1 * zand encrypts it to the Paillier pubkey of
Enc(k_2^-1 * z)
k_2^-1 * r * d_2. This is a scalar value, so with access to
ckeybut no knowledge of
P_2can now compute
Enc(k_2^-1 * r * d_2 * d_1)
P_2now has two ciphertexts encrypted under the Paillier pubkey of
P_1, which thanks to homomorphism, it can add together, producing
Enc(k_2^-1 * z + k_2^-1 * r * d_2 * d_1)
This resultant value, encrypted to the Paillier pubkey of
P_1, is what Lindell calls an "almost-signature". Given the definition of an ECDSA signature above and remembering that
k = k_1 * k_2, all that is required to finalize the signature is for
P_1 to decrypt the ciphertext and multiply the resultant plaintext by
This produces the value
(k_1^-1) * (k_2^-1 * z + k_2^-1 * r * d_2 * d_1), which simplifies to
k_1^-1 * k_2^-1 * z + k_1^-1 * k_2^-1 * r * d_2 * d_1 and then
k^-1 * (z + rd). This is exactly the value of
s we needed to compute to form an ECDSA signature. (I have omitted the
mod n in this description, but remember that all calculations are done under a finite field).
P_1 now has the complete
(r, s) tuple that constitutes an ECDSA signature.
A Note About Honest Parties
In the scheme above, we have ensured the following properties:
- Each party's shares of
dare kept private to their respective party, shared only under Paillier encryption or via an ECDH-like approach (i.e. sharing
k_1 * Gwith
- Both parties are required to perform the signing ceremony
- Neither party ends up with
d, and cannot perform any unilateral signatures under the shared key in the future
This is all fine and good, but many questions arise when we start to question intentionally malicious parties. Lindell has provided zero-knowledge proofs that ensure the integrity of this signing scheme even in the presence of malicious parties, but an exploration of those proofs would make this post a lot longer (and also require me to be a lot more comfortable with ZK proofs(!)).
05. Distributed Trust and the Future
Lindell’s 2P-ECDSA construction is made possible by homomorphic encryption, a type of cryptography that we may not expect to find in a signature scheme—yet it’s interesting that this somewhat modern technique is the bridge between a novel signature scheme and a “legacy” algorithm (i.e. one that was not designed with multiple parties in mind).
Trust has various sources in the digital world. Sometimes trust is placed in another individual, entity, or process. We expect our banks to retain the money we deposit with them. This expectation is backed by legal structures. When we ensure trust cryptographically, we are sourcing our trust in the behavior of mathematics* rather than the behavior of another entity. A mutual lack of trust is not an indictment, accusation, or prediction of malicious behavior; it is simply a safe starting point. Establishing trust between multiple parties requires overhead, and systems that do not require mutual trust can eliminate this overhead and reduce the risk to which the parties are exposed.
In many cases, our expectation that math will continue to work as we understand it is stronger than our expectation that other entities will continue to behave with our best interests in mind.
While we can build systems that are distributed by design, I also expect that we will continue to see novel “retrofits” of distributed trust properties into existing cryptographic systems. 2P-ECDSA is just one example of this.
* This is a simplified view of cryptographic assurance. Most real-world breaks in cryptographic systems are due to implementation issues, not offensive cryptanalysis.
"Multi-signature" vs. Threshold
I want to take a brief detour to focus on the idea of multi-signature operations, which are frequently used in cryptocurrencies. Multi-signature transactions are transactions that are not valid without the signatures of multiple participants, and a quorum system could be implemented in this way. Multi-signature schemes have the following drawbacks in comparison to threshold approaches:
- Members of the quorum can be identified, since they each perform signatures with their own keypair
- Explicit support for multi-signature schemes is required, whereas some threshold approaches can be validated with existing cryptographic tools intended for the single-party use case