So, after more than a year without writing anything here, I was bored today and thought it would be nice to share a new piece on attacking cryptographic implementations here 🙂
Differential Fault Analysis (DFA) attacks are part of what is known as fault injection attacks. This is, they are based on forcing a cryptographic implementation to compute incorrect results and attempt to take advantage from them. With fault injection attacks (also often called active side channel attacks) one can achieve things like unauthenticated access to sensitive functionality, bypassing secure boot implementations, and basically bypassing any security checks an implementation performs.
With DFA attacks, one is able to retrieve cryptographic keys by analyzing correct/faulty output pairs and comparing them. Of course, this assumes you are able to inject faults somehow... which is often true in hardware implementations: gaming consoles, STBs, smart cards, etc. At the software level, one can achieve similar things by debugging the implementation and changing data or by patching instructions... but this is something we have been doing for a long time, haven't we? 🙂 I often say that fault injection attacks are the analog version of 'nopping' instructions out in a program, although we often do not know exactly what kind of faults we are injecting (i.e. we often miss a fault model, but we still successfully attack implementations in this way).
There are ways to protect against this kind of attack as an application programmer, but this is not the objective of this post. In the remainder of this post, I will explain two powerful DFA attacks on two modern cryptographic algorithms: RSA and (T)DES. For some information on protecting from these attacks as a programmer, take a look at these slides. If there is some interest, I will outline the most common techniques to perform fault attacks in a future post.
After a long long while, it's time to go on with our crypto series. Last time we talked about the RSA cryptosystem, and we learned its security is based on the integer factorization problem (plus the DL problem for message secrecy). Today, we'll continue with public key cryptosystems: we'll look into Elliptic Curve Cryptography.
If we are talking about Elliptic Curve Cryptography, first we need to define what an Elliptic Curve is. Mathematically, an Elliptic Curve is a curve with the following equation:
This means that every point (x,y) for which the above expression is met will be part of the curve. However, it turns out in our case we can simplify the equation because the curves we'll be using can generally be written as:
Such a curve, over the reals (i.e. x and y are real numbers) and with a=-3, b = 1, looks like this:
What makes these curves special is that we can define an abelian group with them. To do that, we define the point at infinity and an addition law. The addition law is depicted in the following picture from Wikipedia:
As you can see, if you want to add two points P and Q, you draw a line through them. The intersection of this line and the curve is the point -(P+Q). Then, you just need to invert this point (negate the y coordinate) to obtain the final result.
Of course, we have special cases. If the point is added to itself, the line is defined as the tangent to the curve at that point, as intuitively the tangent touches 'two times' the point.
If we add a point to its inverse, we get a vertical line... and that's a problem because it will never touch the curve. Here is where the point at infinity comes to rescue. The point at inversity is simply 'up there' (and 'down there'), and is the zero element of the group.
Elliptic Curves for Cryptography
We have defined above how an elliptic curve looks like over the reals, and how to perform additions of two points. Obviously, when addition is defined we also have multiplication for free: just add a point to itself several times in a row (although you can do it in smarter and more efficient ways).
But how do we use it for cryptography? I mean, where is the difficult problem here? Actually, the difficult problem is again the discrete logarithm problem. In this case, we define it as follows:
Given a curve E and all its parameters, a base point P and a point Q=nP, obtain n.
And how is this difficult in the curves defined above, you might be thinking... The truth is we do not use real curves in ECC, but we use curves over finite fields instead. We can do it over prime fields GF(p), or we can do it over binary fields GF(2^n). I'll look only at GF(p) here, but similar concepts apply (although the simplified expression I defined above is slightly different in that case).
So, the curve I depicted previously taken over GF(8761) looks like this:
Messy, huh? Exactly the same addition laws apply here, but now when you add two points you draw a line... and when the line gets out of the GF(p) x GF(p) plane it wraps around and comes back from the other side. It is a little more difficult to depict and to visualize, but the concept is the same as before. And now you probably start seeing why this is difficult to solve...
Why Elliptic Curves?
Now you might be wondering... why do we use Elliptic Curve cryptography at all? What are the benefits? The answer is that the ECC allows us to use smaller keys than other algorithms like RSA / 'normal' DL systems for the same amount of security.
This is because the best known general methods for solving the DL in Elliptic Curve are of exponential complexity, while for the other systems we know subexponential methods. Hence, the DL problem under Elliptic Curves is believed to be more difficult than the equivalent base problems for other public key cryptosystems.
Now that we know how elliptic curves are used in cryptography and what benefits they have over traditional
Elliptic Curve Diffie-Hellman
So, if you remember from when we talked about Diffie-Hellman, this is a key exchange protocol that relies on the Discrete Logarithm problem (and the Diffie-Hellman assumption). Usually this is done over a finite field GF(p), but now we have just defined a group based on Elliptic Curves which we can use as well.
In this case, Alice has a private key and a public key , where G is the base point. Similarly, Bob has and . Alice and Bob exchange public keys, and then each of them can compute a common point .
This protocol relies on the assumption that the DL problem is infeasible in the elliptic curve (which requires a base point G of high order) and the Diffie-Hellman assumption.
Other ECC algorithms
Besides the EC Diffie-Hellman algorithm defined above, there are several other algorithms based on Elliptic Curves. For example, one could compute digital signatures using Elliptic Curve DSA or Elliptic Curve Nyberg Rueppel. Each algorithm has its own details, but the important problem used as a foundation for each of them is the Discrete Logarithm problem over Elliptic Curves as we have defined it here.
Beware, however, that similarly to other algorithms, ECC algorithms rely also on other conditions. For example, for ECDSA (and DSA) there is a secret parameter that must be unique, and two signatures with the same value for this parameter will reveal your secret key. As usual, if you implement cryptography. you need to be aware of the requirements and limitations or you will certainly screw up (toc toc SONY!).
Somewhere before the weekend I was discussing about Padding Oracles with a friend and somehow it came up that there was no public tool using timing information for this kind of attacks.
I had seen that Thai and Juliano mentioned timing leaks in their talk at EkoParty, but since AFAIK there was no public tool available I decided to look into it. Also, some weeks ago I added the CBC-R encryption part to my scripts, in order to be able to encrypt arbitrary information as long as we are able to control the IV.
So in this post I'm going to write about these two things: CBC-R encryption and a web based padding oracle attack script using timing information.
I wanted to share with you guys the little challenge I prepared for the Campus Party Europe. The wargame was organized by SecurityByDefault and took place during the last couple of days.
I was asked to prepare a cryptography challenge for it, and I delivered a little problem that became the level 4 challenge in the crypto category. The problem is based around RSA with 2048 bit keys and AES in ECB mode with 128 bit keys.
The idea was to give some real crypto instead of the typical break-classic-crypto or find-the-needle-in-the-haystack challenges. Of course, I am not asking you to factor an RSA-2048 modulo (well, I am, in a way...) nor breaking AES in a mathematical sense because that is not feasible nowadays. You have to find the trick ;-).
Want to challenge yourself? Give it a try!
I'll leave the challenge here, and the solution will be published in SecurityByDefault in some time. If you have questions or want to share ideas with me you can use the comments, but please do not spoil the solution for other readers!
These are the instructions:
In one of our missions we have intercepted an email containing a file encrypted with AES in ECB mode with a 128 bit key. Together with the file there was what we suspect is the AES key encrypted with a 2048 RSA key, which we found to be as follows:
-----BEGIN PUBLIC KEY-----
-----END PUBLIC KEY-----
The encrypted AES key is as follows:
Although it was a tough mission, our Operations team did a great job and was able to provide the following information on the target:
- It uses a cryptographic device that contains a 1024 bit modular exponentiation accelerator
- The device uses the same key for decryption and for signature generation
In addition, the Operations team modified the hardware used by our target and was able to collect a pair of RSA signatures over the same data. One of these signatures contains a fault injected thanks to our hardware modification, while the other one is the correct signature. These are the signature values:
Unfortunately, the team was not able to obtain the private RSA key nor decrypt the AES file. It is critical for the mission to obtain the contents of the encrypted file. Your task is to obtain the contents of the AES file.
PS: All RSA operations are RAW operations. This means no padding, just modular exponentiation. For keys smaller than the modulus, the padding is null (i.e. zero bytes).
And the file encrypted with AES can be found here.
After seeing how the ElGamal system works, today we are going to take a look at the RSA public key cryptosystem. The RSA algorithm was first published by Rivest, Shamir and Adleman in 1978 and is probably the most used crypto algorithm today.
Despite this fact, the algorithm seems to have been invented by Clifford Cocks, a british mathematician who worked for a UK intelligence agency. Since this work was never published due to the top-secret classification, the algorithm received its name from Rivest, Shamir and Adleman who were the first to discuss it publicly. A document declassified in 1997 revealed the fact that Clifford Cocks had actually described an equivalent system in 1973.
Let me remind you once again that these posts are not intended to be 100% accurate in a mathematical sense, but an introduction for people who doesn't know much about cryptography. If you want more accurate and complete descriptions, take a crypto book such as the Handbook of Applied Cryptography I've linked in most of my posts :).
Setting up the RSA algorithm
The RSA algorithm is based on the assumption that integer factorization is a difficult problem. This means that given a large value n, it is difficult to find the prime factors that make up n.
Based on this assumption, when Alice and Bob want to use RSA for their communications, each of them generates a big number n which is the product of two primes p,q with approximately the same length.
Next, they choose their public exponent e, modulo n. Typical values for e include 3 (which is not recommended!) and (65537). From e, they compute their private exponent d so that:
Where is the Euler's totient of n. This is a mathematical function which is equal to the number of numbers smaller than n which are comprimes with n, i.e. numbers that do not have any common factor with n. If n is a prime p, then its totient is p-1 since all numbers below p are comprimes with p.
In the case of the RSA setup, n is the product of two primes. In that case, the resulting value is lcm((p-1)(q-1)) because only the multiples of p and q are not comprimes with n.
Once our two parties have their respective public and private exponents, they can share the public exponents and the modulus they computed.
Encryption with RSA
Once the public key (i.e. e and n) of the receiving end of the communication is known, the sending party can encrypt messages like this:
When this message is received, it can be decrypted using the private key and a modular exponentiation as well:
sage: p=random_prime(10000) sage: q=random_prime(10000) sage: n=p*q sage: p,q,n (883, 2749, 2427367) sage: e=17 sage: G=IntegerModRing(lcm(p-1,q-1)) sage: d = G(e)^-1 sage: G(d)*G(e) 1 sage: m=1337 sage: G2=IntegerModRing(n) sage: c=G2(m)^e sage: c 1035365 sage: m_prime=G2(c)^d sage: m_prime 1337
In the commands above, I first create two random primes below 10000 and compute n. Then I create a IntegerModRing object to compute things modulo lcm(p-1,q-1) and perform the computation of the private exponent as the inverse of the public exponent on that ring.
Next, I create a new ring modulo N. Then I can use the public exponent to encrypt a message m and use the private exponent to decipher the cryptotext c... and it works!
Correctness of RSA encryption/decryption
We have seen it works with our previous example, but that doesn't prove that it really works always. I could have chosen the numbers carefully for my example and make them work.
Euler's theorem tells us that given a number n and another number a which does not divide n the following is true:
Therefore, and since , for any message m that does not divide n the encryption and decryption process will work fine. However, for values of m that divide n we need to use more advanced maths to prove the correctness.
Another way to prove it is to use Fermat's little theorem and the Chinese Remainder Theorem. I will explain these theorems in my next post and then I will provide a complete proof based on them.
RSA for signing
In the case of RSA, digital signatures can be easily computed by just using d instead of e. So, for an RSA signature one would take message m and compute its hash H(m). Then, one would compute the signature s as:
For verifying the signature, the receiving end would have to compute the message hash H(m) and compare it to the hash contained in the signature:
Therefore, if the hash computed over the received message matches the one computed from the signature, the message has not been altered and comes from the claimed sender.
Security of RSA
In order to completely break RSA, one would have to factor n into it's two prime factors, p and q. Otherwise, computing d from e would be hard because (p-1) and (q-1) are not known and n is a large number (which means that computing its totient is also difficult).
In a few posts I will show an algorithm to solve the factorization problem. However, another way to break RSA encrypted messages would be to solve a discrete logarithm. Indeed, since , if one solves the discrete logarithm of c modulo n, the message would be recovered.
Luckily, we already know that discrete logs are not easy to compute. And in this case, solving one does not break the whole system but just one message.
Let's go a little further in our way to understand the way the DNIe works. In my previous post I talked about the device authentication procedure and today I'll talk about what happens next, how Secure Messaging protects all the subsequent communication.
By the way, I updated the previous post with information on how to get the card's serial number.
Device authentication, quick reminder
As I said in the previous blog, the device authentication phase consists of the following steps:
- Certificate exchange: The terminal (IFD) requests a X.509 certificate from the card and sends its own certificate and an intermediate CA's certificate to the card
- Internal authenticate: The IFD sends a random challenge to the card and requests it to authenticate itself. This is done with an RSA signature, which is then encrypted for the IFD, and includes a 32 byte random number known as Kicc.
- External authenticate: The terminal authenticates itself, requesting a challenge from the card and sending a signed and encrypted message to the card. Again, this message includes a 32 byte random number known as Kifd.
- Key generation: both ends generate a key for encryption and a key for authentication. This is done by XORing first the two random numbers, and computing then the SHA-1 hash of the result with a constant 1 appended for the encryption key and a constant 2 for the authentication (MAC) key.
So basically at the end of this process, both ends share a pair of keys that can be used for protecting the confidentiality and the integrity of subsequent messages.
Let's see how this is done.
In our last post we learnt about the Discrete Lograithm problem, why it is a difficult problem and how we can attempt to solve it if the numbers are manageable. Of course, in a real setting we wouldn't use 16 bit numbers as in my example, but at least 1024 bit numbers nowadays (and most likely even bigger numbers).
Now, we are going to see how to make use of that problem to create a public key cryptosystem. We will look at how ElGamal uses the DL problem to provide public key encryption and digital signatures. Keep on reading if you are interested!
From last post, it becomes clear that at this stage we won't be able to make it without some maths. That's because we are dealing now with public key crypto, which is based on difficult mathematical problems (as in difficult to solve, not as in difficult to understand).
With symmetric crypto, we could understand the concepts of diffusion and confusion without needing to dive into maths. On the other hand, here we will need to understand the problems on which the algorithms rely in order to understand how they work.
In this post, we'll see what's the Discrete Logarithm problem, why it is difficult to solve based on a simple intuition, and finally a method to solve this kind of problems. Of course it's not the only (nor the best) existing method, but in my opinion it is the simplest one to understand.
In the previous post, I said I'd write about the Discrete Logarithm problem in the next post. However, I forgot to mention the general idea behind digital signatures. Since I can't sleep right now and have to take a train to the airport in a couple of hours, I decided to go ahead and write a few lines about digital signatures ;-).
The basic idea behind digital signatures is to make use of the fact that in public key cryptography a user has a private key which is never disclosed to anyone in order to authenticate the user or messages generated by that user.
In a symmetric setting, authentication is performed using MAC or HMAC mechanisms, and at least two parties know the key used to generate those messages. Therefore, a given party could deny that he or she generated a given authenticated message, because he is not the only one who knows that key and therefore there is no proof that he did generate the message.
Of course, if only two parties know the key, and one of the parties knows that a particular message was not generated by himself, then it must come from the other party. However, in a legal dispute, there is no way to prove that and to an external observer both of the options are equally likely.
To solve that issue, digital signatures generate a sort of authentication code using a private key, never disclosed to anyone. Then, using the related public key, everyone can verify that signature and therefore be sure that the message came from that user. Since that entity is the only one knowing the private key, this sort of construction can be used to bind a user to a message and resolve any legal disputes that might arise.
Normally, you can see the digital signature generation process as some sort of encryption with a private key. On the other hand, you can imagine the signature verification (or opening) phase as a decryption using the public part of the key.
Practical usage of digital signatures
In real world, documents are usually way larger than the message length that common digital signature algorithms can handle directly. Since authenticating each chunk of a document is not very practical (asymmetric crypto is usually slooooow), in practice a cryptographic hash is computed over the document, and the hash is signed using the private key and the signature algorithm.
Then, in the verification stage, a second hash is computed and compared against the signed hash. If they match, the signature is correct and therefore the received document was created by the signing party and has not been modified.
Of course, this assumes that cryptographic hash functions behave as expected, and there are no collisions. Ohterwise, if one might find another document which produces the same hash (and thus the same signature), any legal proof that the document was created by the private key holder would be destroyed.
Therefore, choosing secure hash functions for usage within digital signatures is a crucial issue. As an example problem that arose due to the use of insecure hash functions with digital certificates, check the Hashclash project.
As some of you might have noticed already by looking at the title, this post will be the first one talking about public key cryptography. Today, I'll introduce the basic ideas around public key crypto and the ideas proposed by Diffie and Hellman in their famous paper 'New Directions in Cryptography' from 1976.
In subsequent posts, we well look at the discrete logarithm problem and the factorization problem. We'll also look into some public key cryptosystems, such as El-Gamal and RSA. And after that, we'll look at Elliptic Curve Cryptography. With all this, the algorithms part of this series will be considered closed and I'll move into cryptographic systems and protocols ;-). Stay tuned!