Crypto Primer

Download as pdf or txt
Download as pdf or txt
You are on page 1of 28

SECTION 2

CRYPTOGRAPHY – GENERAL OVERVIEW

Encryption is a process of converting information to a disguised form in order


to send it across a potentially unsafe channel. The reverse process is called
decryption. Using strong encryption techniques, sensitive, valuable
information can be protected against organized criminals, malicious hackers,
or spies from a foreign military power, for example. Indeed, cryptography
used to be almost exclusively a tool for the military. However, in moving into
an information society, the value of cryptography in everyday life in such
areas as privacy, trust, electronic payments, and access control has become
evident. In this way, the field of cryptography has broadened from classical
encryption techniques into areas such as authentication, data integrity, and
non-repudiation of data transfer

Basic Terminology

Suppose that someone wants to send a message to a receiver, and wants to


be sure that no-one else can read the message. However, there is the
possibility that someone else opens the letter or hears the electronic
communication.

In cryptographic terminology, the message is called plaintext or cleartext.


Encoding the contents of the message in such a way that hides its contents
from outsiders is called encryption. The encrypted message is called
ciphertext. The process of retrieving the plaintext from the ciphertext is
called decryption. Encryption and decryption usually make use of a key,
and the coding method is such that decryption can be performed only by
knowing the proper key.

Cryptography is the art or science of mathematical techniques related to


such aspects of data security as

• confidentiality, or keeping secret the content of information from


unauthorized parties;
• data integrity, or detecting the unauthorized alteration of data;
• authentication, or identifying either entities or data origins;
• non-repudiation, or preventing an entity from denying previous
commitments or actions.

Cryptanalysis is the study of mathematical methods which are used in


attempting to defeat cryptographic techniques. Cryptology means the study
of cryptography and cryptanalysis.

ALGORITM
The following are commonly used cryptographic algorithms and methods and
explains the basic underlying concepts.

Public-Key Cryptosystems (Asymmetric Ciphers)

Public-key cryptosystems were invented in the late 1970's, with some help
from the development of complexity theory around that time. It was
observed that based on a problem so difficult that it would need thousands of
years to solve, and with some luck, a cryptosystem could be developed which
would have two keys, a private key and a public key. With the public key one
could encrypt messages, and decrypt them with the private key. Thus the
owner of the private key would be the only one who could decrypt the
messages, but anyone knowing the public key could send them in privacy.

Another idea that was observed was that of a key exchange. In a two-party
communication it would be useful to generate a common secret key for bulk
encryption using a secret-key cryptosystem (for example, some block
cipher).

Indeed, Whitfield Diffie and Martin Hellman used ideas from number theory
to construct a key exchange protocol that started the era of public-key
cryptosystems. Shortly after that Ron Rivest, Adi Shamir, and Leonard
Adleman developed a cryptosystem that was the first real public-key
cryptosystem capable of encryption and digital signatures.

Later several public-key cryptosystems followed using many different


underlying ideas (for example, knapsack problems, different groups on finite
fields, and lattices). Many of them were soon proven to be insecure.
However, the Diffie-Hellman protocol and RSA appear to have remained two
of the strongest up to now.

Terminology

The basic ingredient in any public-key cryptosystem is a difficult


computational problem. The security of the cryptosystem is based on the fact
that the private key can be computed from the public key only by solving this
difficult problem. We now introduce some relevant terminology used in
public-key cryptography.

• Algorithm. An algorithm is an explicit description of how a particular


computation should be performed (or a problem solved). The efficiency
of an algorithm can be measured as the number of elementary steps it
takes to solve the problem. Usually one has to settle for the
asymptotic running time, which can be expressed using the big -O
notation: we say that an algorithm is O(f(n)), if its worst-case running
time divided by f(n) is bounded by a fixed (positive) constant as the
input size n increases.

• Computational complexity. A problem is polynomial time or in P if it


can be solved by an algorithm which takes less than O(nt) steps,
where t is some finite number and the variable n measures the size of
the problem instance.

If a guessed solution to a problem can be verified in polynomial time


then the problem is said to be in NP (non-deterministic polynomial
time). The set of problems that lie in NP is very large, it includes the
problem of integer factorization.

A problem is NP-hard if there is no other problem in NP that is easier


to solve. There is no known polynomial time algorithm for any NP-
hard problem, and it is believed that such algorithms in fact do not
exist.

In public-key cryptography, the attacker is interested in solving


particular instances of a problem (factoring some given number),
rather than providing a general solution (an algorithm to factor any
possible number efficiently). This causes some concern for
cryptographers, as some instances of a problem that is NP-hard in
general may be easily solvable.

• Primes. A prime number is a number that has no divisors except for


itself and 1. Thus the integers 2,3,5,7,11,... and so on are primes.
There are infinitely many primes, and the record of the biggest known
prime number has been broken almost yearly in recent times.

• Factoring. Every integer can be represented uniquely as a product of


prime numbers. For example, 10 = 2 * 5 (the notation * is common
for multiplication in computer science) and it is unique (except for the
order of the factors 2 and 5). The art of factorization is almost as old
as mathematics itself. However, the study of fast algorithms for
factoring is only a few decades old.

One possible algorithm for factoring an integer is to divide the input by


all small prime numbers iteratively until the remaining number is
prime. This is efficient only for integers that are, say, of size less than
1016 as this already requires trying all primes up to 108. In public-key
cryptosystems based on the problem of factoring, numbers are of size
10300 and this would require trying all primes up to 10150 and there are
about 10147 such prime numbers according to the prime number
theorem. This far exceeds the number of atoms in the universe, and is
unlikely to be enumerated by any effort.

The easy instance of factoring is the case where the given integer has
only small prime factors. For example, 759375 is easy to factor as we
can write it as 35* 55. In cryptography we want to use only those
integers that have only large prime factors. Preferably we select an
integer with two large prime factors, as is done in the RSA
cryptosystem.

Currently one of the best factoring algorithms is the number field sieve
algorithm (NFS) that consists of a sieving phase and a matrix step.
The sieving phase can be distributed (and has been several times)
among a large number of participants, but the matrix step needs to be
performed on large supercomputers. The effectiveness of the NFS
algorithm becomes apparent for very large integers, it can factor any
integer of size 10150 in a few months time. The NFS algorithm takes
sub-exponential time (which is still not very efficient).

There is no known proof that integer factorization is an NP-hard


problem nor that it is not polynomial time solvable. If any NP-hard
problem were polynomial time solvable, then also factoring would, but
there is very little hope that this is the case. It is plausible under
current knowledge that factoring is not polynomial time solvable.

• Discrete logarithms. Another important class of problems is the


problem of finding n given only some y such that y = gn. The problem
is easy for integers, but when we are working in a slightly different
setting it becomes very hard.

To obscure the nature of n in gn, we divide the infinite set of integers


into a finite set of remainder classes. Intuitively, we take the string of
integers and wrap it on a circle (which has circumference of length m).

The numbers 0, m, 2m, 3m, ... all cover the same point on the circle,
and therefore are said to be in the same equivalence class (we also
write "0 = m = 2m = ... (mod m)"). Each equivalence class has a least
representative in 0 .. m-1. So you can write any integer n as t + km
for any integer t, where 0 <= t < m. It is a convention to write n = t
(mod m) in this case. Here m is said to be the modulus.

It can be shown that you can add, subtract and multiply with these
classes of integers (modulo some m).

This structure, when m = p with p a prime number, is often called a


prime field or a Galois field, GF(p). When m is prime, the division of
nonzero integer classes is well defined. According to the proper
mathematical terminology it is a finite field of characteristic p, where p
is the modulus. If m is not a prime number then the structure is called
a (finite) ring. More information about groups, fields and rings can be
read from any good elementary text in algebra.

The discrete logarithm problem in the finite field GF(p) is then stated
as follows:
given two positive non-zero integers a, g (both less than p), compute
n such that a = gn (mod p). We can choose g so that a solution for n
exists for any non-zero a. To make this problem cryptographically hard
p should be a large prime number (about 10300 and n, in general, of
same magnitude.

This problem is currently considered as hard as factoring. The best


method known at this time is the Number field sieve for discrete
logarithms (which uses similar ideas as the NFS for factoring). The
discrete logarithm problem may appear more complicated than integer
factoring, but in many ways they are similar. Many of the ideas that
work for factoring can be also applied in the setting of discrete
logarithms. There is little hope to find a polynomial time algorithm for
the computation of discrete logarithms in GF(p). In such a case it
would be likely that factoring problems also could be efficiently solved.

The discrete logarithm problem can be applied in many other settings,


such as elliptic curves. The discrete logarithm problem over elliptic
curves is commonly used in public-key cryptography. One reason for
this is that the the Number field sieve algorithm does not work here.
There are other methods for computing discrete logarithms over
elliptic curves but it appears even harder to solve the discrete
logarithm problem over elliptic curves than over GF(p). This has also
the effect that there are some key size benefits for using elliptic-curve-
based public-key cryptosystems as opposed to factoring-based
cryptosystems.

• Knapsacks. Given a small set of integers, the knapsack problem


consists of determining a subset of these integers such that their sum
is equal to a given integer. For example, given (2, 3, 5, 7) and 10, we
can find the solution 2 + 3 + 5 = 10, and thus solved the knapsack
problem, by brute-force search.

The general knapsack-problem is provably NP-hard, and thus appears


superior to factorization and discrete logarithm used in public-key
cryptosystems. Unfortunately, all cryptosystems that have used this
underlying idea have been broken - as the used instances of the
problem have not been really NP-hard.
• Lattices. Now we define a vector basis wi = < w1, ..., wn> for i = 1,
..., m, and the lattice that is generated by the basis. That is, elements
of the lattice are of the form
t1w1 + t2w2 + ... + tmwm, where ti are integers.

The problem of finding the shortest vector in a lattice (using the usual
Euclidean distance) is a NP-hard problem (for lattices of sufficiently
large dimension).

However, the celebrated LLL-algorithm by Lenstra, Lenstra, and


Lovasz computes an approximate solution in polynomial time. The
effectiveness of the LLL-algorithm comes from the fact that in many
cases approximate solutions are good enough, and that surprisingly
often the LLL-algorithm actually gives the shortest vector. Indeed, this
algorithm has been often used to break cryptosystems based on lattice
problems or knapsacks. It has been applied also to attacks against
RSA and DSS.

Practical cryptosystems

The wide interest in public-key cryptography has produced several practically


important cryptosystems. In the following they are listed in order of the
underlying problem.

As a basic guideline, a public-key cryptosystem is built from a difficult


problem as follows: take a difficult problem (for example, NP-hard) for which
you can find an instance that can be solved in polynomial time. To encrypt a
message, convert the message into such an easy instance of the difficult
problem, then use the public key to convert the easy problem into a difficult
one. The result is then sent to the recipient through an insecure channel. To
decrypt use the private key to convert the difficult problem into the easy
one and solve it to regain the message. All public-key cryptosystems use this
principle, although they differ significantly in the details (like the underlying
problem or the structure of the public and private keys).

For a good survey on appropriate key lengths see Lenstra and Verheul's
Selecting Cryptographic Key Sizes (appeared in Public Key Cryptography
2000). They present a complete analysis of key sizes for almost all
cryptosystems.

Below, along with each cryptosystem, you will find the current
recommendations for key sizes where appropriate. These recommendations
are not always equal to Lenstra's and Verheul's.

Factorization: RSA, Rabin


• RSA (Rivest-Shamir-Adleman) is the most commonly used public-key
algorithm. It can be used both for encryption and for digital
signatures. The security of RSA is generally considered equivalent to
factoring, although this has not been proved.

RSA computation takes place with integers modulo n = p * q, for two


large secret primes p, q. To encrypt a message m, it is exponentiated
with a small public exponent e. For decryption, the recipient of the
ciphertext c = me (mod n) computes the multiplicative reverse d = e-1
(mod (p-1)*(q-1)) (we require that e is selected suitably for it to exist)
and obtains cd = m e * d = m (mod n). The private key consists of n, p,
q, e, d (where p and q can be forgotten); the public key contains only
of n, e. The problem for the attacker is that computing the reverse d of
e is assumed to be no easier than factorizing n. More details are
available in many sources, such as in the Handbook of Applied
Cryptography.

The key size (the size of the modulus) should be greater than 1024
bits (i.e. it should be of magnitude 10300) for a reasonable margin of
security. Keys of size, say, 2048 bits should give security for decades.

Dramatic advances in factoring large integers would make RSA


vulnerable, but other attacks against specific variants are also known.
Good implementations use redundancy (or padding with specific
structure) in order to avoid attacks using the multiplicative structure of
the ciphertext. RSA is vulnerable to chosen plaintext attacks and
hardware and fault attacks. Also important attacks against very small
exponents exist, as well as against partially revealed factorization of
the modulus.

The proper implementation of the RSA algorithm with redundancy is


well explained in the PKCS standards (see definitions at RSA
Laboratories). They give detailed explanations about how to implement
encryption and digital signatures, as well as formats to store the keys.
The plain RSA algorithm should not be used in any application. It is
recommended that implementations follow the standard as this has
also the additional benefit of inter-operability with most major
protocols.

RSA is currently the most important public-key algorithm. It was


patented in the United States (the patent expired in the year 2000).

• The Rabin cryptosystem may be seen as a relative of RSA, although it


has a quite different decoding process. What makes it interesting is
that breaking Rabin is provably equivalent to factoring.
Rabin uses the exponent 2 (or any even integer) instead of odd
integers like RSA. This has two consequences. First, the Rabin
cryptosystem can be proven to be equivalent to factoring; second, the
decryption becomes more difficult - at least in some sense. The latter
is due to problems in deciding which of the possible outcomes of the
decryption process is correct.

As it is equivalent to factoring the modulus, the size of the modulus is


the most important security parameter. Moduli of more than 1024 bits
are assumed to be secure.

There are currently no standards for the Rabin algorithm but it is


explained in several books. The IEEE P1363 project might propose a
standard and thus make it more widely used.

The equivalence to factoring means only that being able to decrypt any
message encrypted by the Rabin cryptosystem enables one to factor
the modulus. Thus it is no guarantee of security in the strong sense.

• References:

o R. Rivest, A. Shamir, and L. M. Adleman: Cryptographic


Communications System and Method. US Patent 4,405,829,
1983.

o Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone:


Handbook of Applied Cryptography.

o Bruce Schneier: Applied Cryptography.

o Jennifer Seberry and Josed Pieprzyk: Cryptography: An


Introduction to Computer Security.

o Man Young Rhee: Cryptography and Secure Data


Communications.

o Hans Riesel: Prime Numbers and Computer Methods for


Factorization.

Discrete logs: Diffie-Hellman, ElGamal, DSS


• Diffie-Hellman is a commonly used protocol for key exchange.

In many cryptographical protocols two parties wish to begin


communicating. However, assume they do not initially possess any
common secret and thus cannot use secret-key cryptosystems. The
key exchange by the Diffie-Hellman protocol remedies this situation by
allowing the construction of a common secret key over an insecure
communication channel. It is based on a problem related to discrete
logarithms, namely the Diffie-Hellman problem. This problem is
considered hard, and it is in some instances as hard as the discrete
logarithm problem.

In Diffie-Hellman key exchange, the communicating parties, A and B,


first choose a prime p and a generator a of Zp*, 2<=a<=p-2 and
publish these. A chooses a random secret x, 1<=x<=p-2 and B
similarly chooses a secret y, 1<=y<=p-2. A sends to B the number ax
mod p and B sends to A ay mod p. The shared key will be K=axy mod p
which A computes as (ay)x and B as (ax)y. The adversary wishing to
obtain the secret key K now faces the Diffie-Hellman problem:
Compute K=axy mod p from the knowledge of ax and ay only (and not
x, y).

The Diffie-Hellman protocol is generally considered to be secure when


an appropriate mathematical group is used. In particular, the
generator element used in the exponentiations should have a large
period (i.e. order).

Discrete logarithm algorithms can be used to attack Diffie-Hellman,


and with passive attacks that is the best that is currently possible -
assuming correctly chosen parameters. If Diffie-Hellman is applied
with usual arithmetic modulo a prime number, then it suffices to select
a large enough prime and to take some care in selecting the generator
element. Subtle problems may be caused by bad choices of the
generator. Many papers have been written on the problems that may
occur, one of the more well-known references is Oorschot and
Wiener's On Diffie-Hellman key agreement with short exponents
(Eurocrypt'96).

Attacks against Diffie-Hellman include also the man-in-the-middle


attack. This attack requires adaptive intervention, but is in practice
very easy if the protocol doesn't use countermeasures such as digital
signatures.

Usually Diffie-Hellman is not implemented on hardware, and thus


hardware attacks are not an important threat. This may change in the
future, when mobile communication becomes more widespread.
• DSS (Digital Signature Standard). A signature-only mechanism
endorsed by the United States Government. The underlying algorithm
DSA (Digital Signature Algorithm) is similar to the one used by
ElGamal or by the Schnorr signature algorithm. Also it is fairly
efficient, although not as efficient as RSA for signature verification.
The standard defines DSS to use the SHA-1 hash function exclusively
to compute message digests.

The main problem with DSS is the fixed subgroup size (the order of
the generator element), which limits the security to around only 80
bits. Hardware attacks can be a concern to some implementations of
DSS. However, it is widely used and accepted as a good algorithm.

• The ElGamal public-key cipher. This is a straightforward extension of


Diffie/Hellman's original idea on shared secret generation. Essentially,
it generates a shared secret and uses it as a one-time pad to encrypt
one block of data. ElGamal is the predecessor of DSS and is perfectly
usable today, although no widely known standard has been created for
it.

• Elliptic curve cryptosystems are just another way of implementing


discrete logarithm methods. An elliptic curve is basically a set of points
that satisfy the equation y2 = x3 + ax + b when considered in a finite
field of characteristic p (where p must be larger than 3). A slightly
different equation is needed for the cases of small characteristic, p = 2
and p = 3.

The points on elliptic curves can be added together and they form a
structure called a group (in fact an abelian group). This is just a way of
saying that you can do arithmetic with them as you can do with
integers when using just addition and subtraction.

With regard to cryptography, elliptic curves have many theoretical


benefits but they are also very practical. There is no known sub-
exponential algorithm for computing discrete logarithms of points of
elliptic curves unlike for discrete logarithms in (the multiplicative group
of) a finite field, in hyperelliptic curves (of large genus) or in many
other groups. One practical benefit of the non-existence of a fast
discrete logarithm computation for elliptic curves is that the key size,
as well as the produced digital signatures and encrypted messages are
small. Indeed, a very simple way of computing the security limit for
the key size is to take a key size for a secret key cryptosystem in bits
and then just multiply it by 2. This gives a rough estimate, that is
good at the moment for a generic instance of elliptic curves.

Elliptic curves can be implemented very efficiently in hardware and


software, and they compete well in speed with cryptosystems such as
RSA and DSS. There are several standardization attempts for elliptic
curve cryptosystems (for example, ECDSA by ANSI). At the moment
elliptic curves are widely known, but not very widely used in practice.

The security of elliptic curve cryptosystems has been rather stable for
years, although significant advances have been achieved in attacks
against special instances. Nevertheless, these have been conjectured
by the leading researchers several years ago and no great surprises
have yet emerged.

The algorithm XTR recently introduced by Lenstra and Verheul might


become a good competitor for elliptic curves. However, elliptic curves
appear to be slightly better in performance, and definitely scale better
in the key size.

• LUC is a public-key cryptosystem that uses a special group based on


Lucas sequences (related to Fibonacci series) as its basic building
block. It can implement all the common discrete-logarithm-based
algorithms, and in a sense LUC is a class of public-key algorithms.

It is possible to view the underlying structure of the algorithm as a


certain multiplicative group of a finite field of characteristic p with
degree 2 (written shortly as Fp2) - and this can be used to prove that
there exists a sub-exponential algorithm for computing discrete
logarithms and thus attacking the LUC algorithm. Thus it might seem
that LUC is of little interest, as it is basically just another way of
implementing discrete-logarithm-based algorithms on finite fields.
However, LUC uses the specific arithmetic operations derived from
Lucas sequences that might be slightly faster (by a constant factor)
than what would be a more direct approach.

The different public-key algorithms based on LUC arithmetic are called


LUCDIF (LUC Diffie-Hellman), LUCELG (LUC ElGamal), and LUCDSA
(LUC Digital Signature Algorithm). Several of these are patented.

The fact that values used in the LUC algorithms can be represented as
a pair of values gives some additional advantage against just using
integers modulo p. The computations only involve numbers needing
half the bits that would be required in the latter case. As the LUC
group operation is easy to compute this makes LUC algorithms
competitive with RSA and DSS.

However, at present there appears to be little reason to use LUC


cryptosystems, as they offer little benefit over elliptic curves or XTR.
• XTR is a public-key cryptosystem developed by Arjen Lenstra and Eric
Verheul. XTR is similar to LUC in that it uses a specific multiplicative
group of a particular finite field (in fact Fp6) as its underlying group.

However, XTR has novel features such as needing only approximately


1/3 of the bits for signatures and encrypted messages. This is
achieved using a specific trace-representation of the elements of this
group, and performing all computations using this representation.

All discrete-logarithm-based public-key algorithms can be


implemented with XTR ideas. So in a way XTR is a generic name for a
class of public-key algorithms, similarly to LUC.

Perhaps surprisingly, the algorithm is efficient and according to Lenstra


and Verheul it might be a good substitute to elliptic curves, DSS and
even RSA. It has the advantage over elliptic curves that it is based
essentially on the same discrete log problem as, say, DSS, which may
help cryptographers and others to accept it faster as a strong
algorithm.

• References:

o Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone:


Handbook of Applied Cryptography.

o Alfred J. Menezes: Elliptic curve


cryptosystems.

o IEEE P1363: Standard Specifications


for Public-Key Cryptography.

o Neil Koblitz: A Course in Number Theory and


Cryptography.

o Ian Blake, Gadien Seroussi and Nigel Smart: Elliptic Curves in


Cryptography.

o Bruce Schneier: Applied


Cryptography.

Knapsacks

There are only a few interesting knapsack public-key cryptosystems, none of


which are of practical importance.
• Rivest-Chor cryptosystem is based on a particular variant of the
knapsack problem. This is one of the knapsack cryptosystems that has
best resisted attacks.

• Merkle-Hellman. This was the first knapsack cryptosystem. It was


based on the simple idea of hiding the easy super-increasing knapsack
problem by masking. However, it was broken already in 1980.

• References:

o M. E. Hellman and R. C. Merkle: Public Key Cryptographic


Apparatus and Method. US Patent 4,218,582, 1980.

o B. Chor and R.L. Rivest: A knapsack type public key


cryptosystem based on arithmetic in finite field, Crypto '84.

Lattices

In recent years large interest has been directed towards lattice-based


cryptosystems. One of the reasons is that certain classes of lattice problems
are NP-hard, and several efficient cryptosystems have been proposed and
appear strong.

• NTRU is a cryptosystem proposed in mid-1990s as an efficient public-


key cipher. It is based on the lattice problem, and has some
interesting features.

Some of the initial versions had problems, but the current version has
been proposed for some US standards.

• References:

o D. Coppersmith. and A. Shamir: Lattice Attacks on NTRU,


Eurocrypt '97.

o NTRU Cryptosystems Inc. http://www.ntru.com.


o IEEE P1363: Standard Specifications for Public-Key
Cryptography.

Secret-Key Cryptosystems (Symmetric Ciphers)

Secret key algorithms use the same key for both encryption and decryption
(or one is easily derivable from the other). This is the more straightforward
approach to data encryption, it is mathematically less complicated than
public key cryptography, and has been used for many centuries.

Terminology

The following terminology is often used when symmetric ciphers are


examined.

• Block ciphers: A block of cipher transforms n-bit plaintext blocks to


n-bit ciphertext blocks under the influence of a cipher key k. In general
the key is assumed to be chosen at random mathematically , for a
fixed key, a block cipher is a bijection, defining a permutation of n-bit
vectors. Each key potentially (but not necessarily) defines a different
bijection. For plaintext messages exceeding one block in length,
various modes of operation for block ciphers are used.

• Stream ciphers. A stream cipher consists of a state machine that


outputs at each state transition one bit of information. This stream of
output bits is commonly called the running key. The encryption can be
implemented by just exclusively-oring the running key to the plaintext
message.

The state machine is nothing more than a pseudo-random number


generator. For example, we can build one from a block cipher by
encrypting repeatedly its own output. Typically, more elaborate
constructions are used for stream ciphers to obtain high-speed.

Some of the more well-known stream ciphers are RC4 and SEAL.
Several stream ciphers are based on linear-feedback shift registers
(LFSR), such as A5/1 used in GSM. These have the benefit of being
very fast (several times faster than usual block ciphers).

• SSSC (self-synchronizing stream ciphers): The class of self-


synchronizing stream ciphers has the convenient property that it
corrects the output stream after bit flips or even dropped bits after a
short time (say, a few hundred bits).

SSSCs can be constructed using block ciphers in a CFB-related way.


Assume that we have already encrypted n bits of the message and
know that much of the ciphertext (where n denotes the block length of
the cipher). Then we produce a new running key (see above) bit by
encrypting the n ciphertext bits. Take one bit of the output of the
cipher to be the new running key bit. Now moving one bit further we
can iterate this procedure for the whole length of the message.

It is easy to see that a one bit error in a ciphertext cannot affect the
decrypted plaintext after n bits. This makes the cipher self-
synchronizing.

The block cipher used should have sufficiently large block size to avoid
substitution attacks, for example.

• Substitution and transposition: Substitution means replacing


symbols or group of symbols by other symbols or groups of symbols.
Transposition, on the other hand, means permuting the symbols in a
block. Neither of these operations alone, provides sufficient security,
but strong ciphers can be built by combining them. By a Substitution-
Permutation Network (SPN), we mean cipher composed of a number of
stages, each involving substitutions and permutations. Well known
examples of SPNs are DES and CAST-128.

• S-boxes: lookup tables that map n bits to m bits (where n and m


often are equal).

There are several ways of constructing good S-boxes for ciphers, as


well as several ways of measuring them. Some designers use a
rigorous mathematical approach by using bent functions (or related) to
create S-boxes which can be proven to be resistant against some
particular attacks. Other designers use heuristic approaches, which
may lead to S-boxes that are more difficult to handle in mathematical
proofs, but can have additional benefits (such as several different
implementation options).

The S-box may even be the only non-linear part of the cipher. This is
the case in the block cipher DES, and thus may be regarded as the
single most important part of the algorithm. In fact, many consider
DES's S-boxes so good that they use them in their own designs (for
example, Serpent).

• Feistel networks: a Feistel network is a general way of constructing


block ciphers from simple functions. The original idea was used in the
block cipher Lucifer, invented by Horst Feistel. Several variations have
been devised from the original version.
Put simply, the standard Feistel network takes a function from n bits to
n bits and produces an invertible function from 2n bits to 2n bits. The
basic function upon which the structure is based is often called the
round function. The essential property of Feistel networks that makes
them so useful in cipher design is that the round function need not be
invertible, but the resulting function always is.

If the round function depends on, say, k bits of a key, then the Feistel
cipher requires rk bits of the key where r is the number of rounds
used.

The security of the Feistel structure is not obvious, but analysis of DES
has shown that it is a good way to construct ciphers. It is compulsory
that a Feistel cipher has enough rounds, but just adding more rounds
does not always guarantee security.

• The operation of taking the user key and expanding it into rk bits for
the Feistel rounds is called key scheduling. This is often a non-linear
operation, so that finding out any of the rk bits of the round keys does
not directly provide any information about the actual user key. There
are many ciphers that have this basic structure; Lucifer, DES, and
Twofish, just to name a few.

• Expansion, Permutation: these are common tools in mixing bits in a


round function. They are linear operations, and thus not sufficient to
guarantee security. However, when used with good non-linear S-boxes
(as in DES) they are vital for the security because they propagate the
non-linearity uniformly over all bits.

• bitslice operations (bitwise logic operations XOR, AND, OR, NOT and
bit permutations): The idea of bitslice implementations of block ciphers
is due to Eli Biham. It is common practice in vector machines to
achieve parallel operation. However, Biham applied it on serial
machines by using large registers as available in modern computers.
The term "bitslice" is due to Matthew Kwan.

Basically all block ciphers can be written in bitslice manner, but


operations such as addition and multiplication may become very slow.
On the other hand, permutations are almost free as they just require
naming of the registers again and this can be done one the coding
level. Thus, for example, in DES exhaustive key search using bitslice
techniques, one can increment the current key in a fraction of the time
than is usually needed for key scheduling.

The AES finalist Serpent is designed to be implemented using bitslice


operations only. This makes it particularly efficient on modern
architectures with many registers.
Modes Of Operation

Many commonly used ciphers are block ciphers. Block ciphers transform a
fixed-size block of data (usually 64 bits) it into another fixed-size block
(possibly 64 bits wide again) using a function selected by the key. If the key,
input block and output block have all n bits, a block cipher basically defines a
one-to-one mapping from n-bit integers to permutations of n-bit integers.

If the same block is encrypted twice with the same key, the resulting
ciphertext blocks are also the same (this mode of encryption is called
electronic code book, or ECB). This information could be useful for an
attacker. To cause identical plaintext blocks being encrypt to different
ciphertext blocks, three standard modes are commonly used:

• CBC (cipher block chaining): a ciphertext block is obtained by first


XORing the plaintext block with the previous ciphertext block, and
encrypting the resulting value. This way leading blocks influence all
trailing blocks, which increases the number of plaintext bits one
ciphertext bit depends on, but also leads to synchronization problems
if one block is lost.

• CFB (cipher feedback): the kth ciphertext block is obtained by


encrypting the (k-1)th ciphertext block and XORing the result onto the
plaintext. The CFB mode has a self-synchronizing property: as soon as
n (the block size) lost bits of ciphertext have been received correctly,
the decryption will be correct, regardless of possible earlier
transmission errors. Interestingly, a CFB feedback loop can also be
used as a pseudo-random number generator if one simply feeds one
block of true random data with trailing blocks of zeroes into the
encryption routine (although the expected period of this PRNG would
be only about 2n/2 where n is the block size of the cipher).

• OFB (output feedback): the cipher is used as a synchronous key-


stream generator, whose output is XORed with the plaintext to obtain
ciphertext, block by block. The key-stream is produced by iterative
encryption, starting with an initialization vector which, along with the
key, is agreed upon beforehand.

More information on cipher modes can be found in the Handbook of Applied


Cryptography by Menezes et al.
The One-Time Pad

The one-time pad (OTP) is the only cipher that has been proven to be
unconditionally secure, i.e. unbreakable in practice. It has also be proven
that any unbreakable, unconditionally secure, cipher must in principle be a
one-time pad.

The Vernam cipher (invented by G. Vernam in 1917) is a famous instance of


an OTP. This cipher is very simple: take a stream of bits that contains the
plaintext message, and a secret random bit-stream of the same length as the
plaintext which is considered the key. To encrypt the plaintext with the key,
sequentially exclusive-or each pair of key bit and plaintext bit to obtain the
ciphertext bit. If the key is truly random, it can be proven that the attacker
has no means of deciding whether some guessed plaintext is more likely than
any other when having only the ciphertext and no knowledge of the
plaintext.

The practical problem is that the key does not have small constant length,
but the same length as the message, and one part of a key should never be
used twice (or the cipher can be broken). So, we just have traded the
problem of exchanging secret data for the problem of exchanging secret
random keys of the same length. However, this
cipher has allegedly been in widespread use since its invention, and even
more since the security proof by C. Shannon in 1949. Although admittedly
the security of this cipher had been conjectured earlier, it was Shannon who
actually found a formal proof for it.

More information can found, for example, in the book by D. Stinson


Cryptography: Theory and Practice.

DES

The Data Encryption Standard (DES) is an algorithm developed in the mid-


1970s. It was turned into a standard by the US National Institute of
Standards and Technology (NIST), and was also adopted by several other
governments worldwide. It was and still is widely used, especially in the
financial industry.

DES is a block cipher with a 64-bit block size. It uses 56-bit keys. This makes
it suspectible to exhaustive key search with modern computers and special-
purpose hardware. DES is still strong enough to keep most random hackers
and individuals out, but it is easily breakable with special hardware by
government, criminal organizations, or major corporations. DES is getting
too weak, and should not be used in new applications. As a consequence,
NIST proposed in 2004 to withdraw the DES standard.

A variant of DES, Triple-DES (also 3DES) is based on using DES three times
(normally in an encrypt-decrypt-encrypt sequence with three different,
unrelated keys). The Triple-DES is arguably much stronger than (single)
DES, however, it is rather slow compared to some new block ciphers.

Nevertheless, even though DES seems to be of little interest for applications


of today there are many reasons for considering it still important. It was the
first block cipher which was widely deployed in the public sector. Thus it
played an important role in making strong cryptography available to the
public.

Also, the design was exceptionally good for a cipher that was meant to be
used only a few years. DES proved to be a very strong cipher and it took
over a decade for any interesting cryptanalytical attacks against it to develop
(not to underestimate the pioneering efforts that lead to this breakthrough).
The development of differential cryptanalysis and linear cryptanalysis opened
ways to really understand the design of block ciphers.

Although at the time of DES's introduction its design philosophy was held
secret, it did not discourage its analysis - to the contrary. Some information
has been published about its design, and one of the original designers, Don
Coppersmith, has commented that they discovered ideas similar to
differential cryptanalysis already while designing DES in 1974. However, it
was just a matter of time that these fundamental ideas were re-discovered.

Even today, when DES is no longer considered a practical solution, it is often


used to describe new cryptanalytical techniques. It is remarkable that even
today, there are no cryptanalytical techniques that would completely break
DES in a structural way, indeed, the only real weakness known is the short
key size (and perhaps the small block size).

AES

In response to the growing feasibility of attacks against DES, NIST launched


a call for proposals for an official successor that meets 21st century security
needs. This successor is called the Advanced Encryption Standard (AES).
Five algorithms made it into the second round, from which Rijndael was
selected to be the final standard. We will now have a quick look at each of
them and their cryptographic peculiarities.

All the ciphers have a 128-bit block size and they support 128, 192, and 256
bit keys. The rather large key sizes are probably required to give means for
construction of efficient hash functions.

• The AES (Rijndael), the design of two Belgian cryptographers, Joan


Daemen and Vincent Rijmen. A thorough exposition of he Rijndael
cipher can be found in the book written by the designers (see the
Books page).

Rijndael follows the tradition of square ciphers (it is based on ideas


similar to the Square cipher). NIST gave as its reasons for selecting
Rijndael that it performs very well in hardware and software across a
wide range of environments in all possible modes. It has excellent key
setup time and has low memory requirements, in addition its
operations are easy to defend against power and timing attacks.

NIST stated that all five finalists had adequate security and that there
was nothing wrong with the other four ciphers. After all analysis and
received comments were considered, NIST considered Rijndael the
best choice for the AES. The other four finalists are mentioned below.

• MARS by Zunic et al., IBM.

This is an interesting new design (using a special type of a Feistel


network), which depends heavily on the instruction sets available on
modern 32-bit processors. This has the benefit that on these target
machines it is efficient, but it may lead to implementation difficulties in
cheaper architectures like smart cards.

• RC6 by Rivest, Robshaw, and Yin, RSA Laboratories.

RC6 follows the ideas of RC5 - but with many improvements. For
example, it attempts to avoid some of the differential attacks against
RC5's data dependent rotations. However, there are some attacks that
get quite far, and it is unclear whether RC6 is well enough analyzed
yet.

• Serpent by Anderson, Biham, and Knudsen.

Serpent has a basically conservative but in many ways innovative


design. It may be implemented by bitslice (or vector) gate logic
throughout. This makes it rather complicated to implement from
scratch, and writing it in a non-bitslice way involves an efficiency
penalty.

The 32 rounds lead to probably the highest security margin on all AES
candidates, while it is still fast enough for all purposes.

• Twofish by Schneier et al., Counterpane Security.

Twofish is a new block cipher designed by Counterpane (whose


founder and CTO is Bruce Schneier). The design is highly delicate, with
many alternative ways of implementation. It is cryptanalysed in much
detail, by the very authoritative "extended Twofish team". It is
basically a Feistel cipher, but utilizes many different ideas.

This cipher has key dependent S-boxes like Blowfish (another cipher
by Bruce Schneier).

Other Symmetric Ciphers

Blowfish

Blowfish was designed by Bruce Schneier. It is a block cipher with a 64-bit


block size and variable length keys (up to 448 bits). It has gained a fair
amount of acceptance in a number of applications, including Nautilus and
PGPfone.

Blowfish utilizes the idea of randomized S-boxes: while doing key scheduling,
it generates large pseudo-random lookup tables by doing several
encryptions. The tables depend on the user supplied key in a very complex
way. This approach has been proven to be highly resistant against many
attacks such as differential and linear cryptanalysis. Unfortunately it also
means that it is not the algorithm of choice for environments where large
memory space (something like 4096 bytes) is not available..

The only known attacks against Blowfish are based on its weak key classes.

CAST-128
CAST-128 is a DES-like Substitution-Permutation Network (SPN)
cryptosystem which appears to have good resistance to differential, linear
and related-key cryptanalysis. It has the Feistel structure and utilizes eight
fixed S-boxes. CAST-128 supports variable key lenghts between 40 and 128
bits. It is described in RFC 2144.

IDEA

IDEA (International Data Encryption Algorithm) is an algorithm developed at


ETH Zurich in Switzerland by Xuejia Lai. It uses a 128-bit key, and it is
generally considered to be very secure. It has been one of the best publicly
known algorithms for some time (before the AES standard started its second
round, see above). It has been around now for several years, and no
practical attacks on it have been published despite numerous attempts to
analyze it.
One of the best attacks uses the impossible differential idea of Biham,
Shamir and Biryukov. They can attack only 4.5 rounds of IDEA and this
poses no threat to the total of 8.5 rounds used in IDEA.

IDEA is patented in the United States and in most European countries.

Rabbit (CryptiCore)
Rabbit is a recent stream cipher, based on iterating a set of coupled
nonlinear functions. It is characterized by a high performance in software. A
detailed description and security analysis of Rabbit is available from the
designers' web site (www.cryptico.com).

RC4

RC4 is a stream cipher designed by Ron Rivest at RSA Data Security, Inc. It
used to be a trade secret, until someone posted source code for an algorithm
on the usenet, claiming it to be equivalent to RC4. There is very strong
evidence that the posted algorithm is indeed equivalent to RC4. The
algorithm is very fast. Its security is unknown, but breaking it does not seem
trivial either. Because of its speed, it may have uses in certain applications.
It accepts keys of arbitrary length.

RC4 is essentially a pseudo random number generator, and the output of the
generator is exclusive-ored with the data stream. For this reason, it is very
important that the same RC4 key never be used to encrypt two different data
streams.

SEED
SEED is a block cipher developed by the Korea Information Security Agency
since 1998. Both the block and key size of SEED are 128 bits and it has a
Feistel Network structure which is iterated 16 times. It has been designed to
resist differential and linear cryptanalysis as well as related key attacks.
SEED uses two 8x8 S-boxes and mixes the XOR operation with modular
addition. The algorithm specification is available through the SEED
homepage. SEED has been adopted as an ISO/IEC standard (ISO/IEC 18033-
3), an IETF RFC, RFC 4269 as well as an industrial association standard of
Korea (TTAS.KO-12.0004/0025).

Cryptography before the 1970s

In this section some of the famous ciphers of the past are listed, with links to
more complete information where possible.
• Fish was used by the German army in WWII to encipher high-
command communications. It was produced by a stream cipher called
the Lorentz machine. Fish was the name given to it by British
cryptanalysts. It was important because it caused difficulties for British
analysts, who finally developed a machine called Colossus, which was
arguably the first, or one of the first, digital computers.

The Colossus machine may have been an important factor in the


planning and success of the Allied attack on Normandy. Given the
intelligence produced by cryptanalysis of Fish, Allied forces knew the
positions of pretty much every German division.

• Enigma was another cipher used by the Germans in World War II. The
machine used several rotors and looked like a typing machine.
However, first Polish and then later British mathematicians were able
to keep up with the development of the machine. Most communication
using the basic version of Enigma was deciphered by British analysts
at Bletchley Park within few hours of the interception. One of the
strongest Enigma's were used in submarine communication, but British
analysts managed to break them with great implications for the battle
on the Atlantic.

There are several good books about Enigma and Bletchley Park. In
addition, the work of the major figure of British cryptanalysis, Alan
Turing, has been explained in many articles and books. Recently his
original notes about cryptanalysis from that time has been released for
the public.

This cipher, or a variant of it, is used by the unix crypt(1) program. It


is unlikely that any variant of Enigma could be considered very secure
by todays standards.

• Vernam cipher is described in detail above.

• Substitution cipher. This is one of the basic pencil-and-paper


methods. Make a table by first listing your alphabet in order on the
first row, and then making a random permutation of the alphabet on
the second row. You can then encrypt any character of the alphabet by
first looking it up on the first row, and the writing down the random
character on the second row. The key of this method is the
permutation of the alphabet on the second row. Decryption works in
reverse.

This method is suspectible to frequency analysis, as long as the


alphabet size is small. Modern block ciphers can be seen as a variant
of this idea, in the sense that they try hide the message under a very
large alphabet that depends on the key. In the block cipher case the
permutation is generated by the secret key and the key space might
not cover all the possible permutations.

• Vigenere. This cipher uses clock arithmetic to add together the key
and the message. The difference between OTP and Vigenere is that in
Vigenere we explicitly reuse the short key several times for one
message.

Methods for attacking Vigenere ciphers are the Kasiski test, index of
coincidence etc. These lead to effective methods which break even
very short message (relative to the key size of course).

• Hill cipher. The Hill cipher uses matrices in clock arithmetic, and is
highly suspectible to known plaintext attacks.

Cryptographic Hash Functions

Cryptographic hash functions are used in various contexts, for example to


compute the message digest when making a digital signature. A hash
function compresses the bits of a message to a fixed-size hash value in a
way that distributes the possible messages evenly among the possible hash
values. A cryptographic hash function does this in
a way that makes it extremely difficult to come up with a message that
would hash to a particular hash value. Some of the best known and most
widely used hash functions are briefly described below.

• SHA-1 (Secure Hash Algorithm) (also SHS, Secure Hash Standard):


This is a cryptographic hash algorithm published by the United States
Government. It produces a 160 bit hash value from an arbitrary length
string. SHA-1 is still considered adequately safe for practical
applications, but stronger versions, SHA-256, SHA-384, and SHA-512,
which produce 256-, 384-, and 512-bit hashes, respectively, are
available and will replace SHA-1 as cryptological research on it
advances.

The official standard text can be found in FIPS PUB 180-1.

• RIPEMD-160 is a hash algorithm designed to replace MD4 and MD5


(see below). It produces a digest of 20 bytes (160 bits, hence the
name) and has been placed in the public domain by its designers. The
RIPEMD-160 homepage is at
www.esat.kuleuven.ac.be/~bosselae/ripemd160.html
• MD5 (Message Digest Algorithm 5) is a cryptographic hash algorithm
developed at RSA Laboratories. It can be used to hash an arbitrary
length byte string into a 128-bit value.

MD5's ancestors, MD2 and MD4 have been broken, and there are some
concerns about the safety of MD5 as well. In 1996 a collision of the
MD5 compression function was found by Hans Dobbertin (Hans
Dobbertin, FSE'96, LNCS 1039). Although this result does not directly
compromise its security, as a precaution the use of MD5 is not
recommended in new applications.

• Badger (CryptiCore) is a recent message authentication code which


was first presented at the Applied Cryptography and Network Security
Conference (ACNS) in 2005. It uses universal hashing as its design
principle, allowing for a rigorous proof of its security under
randomness assumptions for the keys. In addition, it uses some novel
ways of combining the ingredients to make it one of the fastest known
MAC algorithms. Bagder was developed by Cryptico and a detailed
description as well as security analysis is available from Cryptico's web
site.

Random Number Generators

Cryptographic systems need cryptographically strong (pseudo) random


numbers that cannot be guessed by an attacker. Random numbers are
typically used to generate session keys, and their quality is critical for the
quality of the resulting systems. The random number generator is easily
overlooked, and can become the weakest point of the cryptosystem.

Some machines may have special purpose hardware noise generators. Noise
from the leak current of a diode or transistor, least significant bits of audio
inputs, times between interrupts, etc. are all good sources of randomness
when processed with a suitable cryptographical hash function. It is a good
idea to acquire true environmental noise whenever possible.

One cryptographical pseudo random number generator is Yarrow by


Counterpane. The ANSI X9.17 standard defines another one, which is based
on the triple-DES encryption algorithm. This function is widely used and it
has been FIPS-approved by the US government. Another well known pseudo
random number generator, used to produce, for example, DSA private keys,
is defined in the FIPS 186 standard. A good page to search for further
material on (statistical) pseudo-random number generators is the pLab site.
Any cryptographically good pseudo-random number generator should pass all
the basic tests for statistical randomness.
PROTOCOLS AND STANDARDS

This section describes software protocols used in various applications. These


protocols are built on lower-level cryptographic algorithms, as described in
the algorithms page.

Cryptography works on many levels. On one level you have algorithms, such
as symmetric block ciphers and public-key algorithms. Building upon these
you obtain protocols, and building upon protocols you find applications (or
other protocols).

It is not sufficient to study the security of the underlying algorithms alone, as


a weakness on a higher-level protocol (or application) can render the
application insecure regardless of how good the underlying cryptographic
algorithms are. A simple example is a protocol that leaks information about
the key being used to encrypt the communication channel. Irrespective of
how good the encryption algorithms are, they are rendered insecure if the
overlying protocol reveals information on the keys used in encryption.

The analysis of protocols is often difficult because the applications that


implement such protocols may lead to further problems. Thus a good
protocol is not enough, you must also have a good and robust
implementation.

In the following, several well-known protocols and standards are mentioned.

• Domain Name Server Security (DNSSEC).

This is a protocol for secure distributed name services. It is defined in


RFC 3007 and RFC 3008.

• Generic Security Services API (GSSAPI)

GSSAPI provides an authentication, key exchange, and encryption


interface to different cryptographic algorithms and systems. It is
defined in RFC 2743.

• Secure Socket Layer (SSL) / Transport Layer Security (TLS)

SSL is one of the two protocols for secure WWW connections (the
other is SHTTP). WWW security has become important as increasing
amounts of sensitive information, such as credit card numbers, are
being transmitted over the Internet.
SSL was originally developed by Netscape in 1994 as an open protocol
standard. The internet draft of the SSL Protocol 3.0 can be found here.
In 1996, SSL development became the responsibility of the Internet
Engineering Task Force (IETF) and it renamed SSL to TLS (Transport
Layer Security). However, TLS 1.0 differs very little from SSL 3.0.
Extensions for TLS are described in RFC 3546.

• Secure Hypertext Transfer Protocol (SHTTP)

This is another protocol for providing more security for WWW


transactions. In many ways it is more flexible than SSL, but due to
Netscape's original dominance in the marketplace SSL/TLS is in a very
strong position. SHTTP is defined in RFC 2660.

• E-Mail security and related services


o OpenPGP is a standardization of what Phil Zimmermann's PGP
already did for many years. But now that it is a standard,
different implementations come into existence.

o Secure-MIME (S/MIME) is an alternative for the OpenPGP


standard maintained by the IETF working group S/MIME.

• Public Key Encryption Standards (PKCS)

These standards are developed at RSA Data Security and define safe
ways to use RSA. The PKCS documents published by RSA Laboratories
are available at their web site.

• IEEE P1363: Standard Specifications for Public-Key


Cryptography

A (upcoming) standard for public key cryptography. Consists of several


public key algorithms for encryption and digital signatures. Has a very
thorough annex that goes through all necessary implementation
details. See more at their homepage.

• Publius Censor-Resistent Publishing Protocol

This is a very advanced system allowing a group of authors and


readers to share documents on a set of web servers so that (1) no
author or reader has to reveal his/her identity, (2) documents are
certified to come from a certain (pseudonymous) author, (3)
documents cannot be removed or modified (censored) unless a large
amount of the involved web servers are compromised. Technical
reports, software, and links to related projects can be found on their
homepage.

• Secure Shell

The Secure Shell version 2 protocol is developed by the IETF working


group SecSh. This protocol is versatile for the needs of the Internet,
and is currently used in the SSH Tectia software. The protocol is used
to secure terminal sessions and arbitrary TCP connections. The Secure
Shell v2 protocol is based on its predecessor Secure Shell v1
developed by Tatu Ylönen. The protocol specifications (RFCs 4250 to
4256) can be found from the IETF.

• IPSec

While all the above protocols operate on the application layer of the
internet, allowing particular programs to communicate on a secure
channel in an inherently insecure network, IPSec attempts to make the
internet secure in its essence, the internet protocol (IP). The IPSec
protocols are defined in RFC 2401.

Next

SECTION 3

RESEARCH/WORK DONE BY GROUP

Appendix 2(1) – Key Generation Chamber General Specifications


Appendix 2(2) – Crypto Algorithm Technical Specifications
Appendix 2(3i) – Key Generation Architecture

You might also like