Project Front Pages
Project Report On
Optimization of DES
algorithm using OpenCL
(A Parallel Project)
Guided by
Ms. Nipuna Sri
Presented by
Rajesh Kumar Patel
Sujeet Kumar Singh
PRN: 140850122029
PRN: 140850122037
Signature
(Mr./Ms. )
Project Guide
Signature
(Ms. )
Course Coordinator
Table of Contents
1. Introduction
1.1. Abstract
1.2. Project Scope
1.3. Schedule
2. Requirement Specification & Methodologies
2.1. System Requirements
2.2. DES Algorithm
2.3. Parallel Programming
2.4. Why do we need Parallel Programming?
2.5. OpenCL
3. Code Implementation
3.1. Sequential Code
3.2. Parallel Code
4. Result
5. Graphical Analysis
6. Conclusion
7. References
Introduction
This work is concerned with the development of fast bit-sliced DES brute-force software
that utilizes consumer Graphics Processing Units (GPUs) and shows improved performance
over existing implementations. The use of modern GPUs in high-performance computing is a
recent trend: computationally intensive tasks can be offloaded to such devices to achieve
a significant performance boost compared with the traditional use of general-purpose
Central Processing Units (CPUs). GPU programming is supported by
programming models based on the C language, e.g., the vendor-specific CUDA and the
vendor-neutral OpenCL standard. The Data Encryption Standard (DES) was chosen as a case study because
the block cipher uses permutations and substitutions of data rather than the arithmetic
calculations at which GPUs are known to excel. The goal is to evaluate the potential of GPUs
for this type of application. Furthermore, the DES cipher has a limited 56-bit key space,
which has been successfully cracked using FPGAs. However, FPGAs are much more
expensive than GPUs and require much more programming effort than CUDA
and OpenCL.
Modern GPUs are attractive for parallel processing because these architectures have, by
design, hundreds of processing cores and on-chip bandwidth close to one
order of magnitude higher than that of modern CPUs. These GPUs also hide
latency in memory transactions well through massive multithreading with low context-switch
overhead. Instructions in the thread contexts are processed according to the Single
Instruction Multiple Data (SIMD) paradigm, which is therefore suitable for algorithms
that expose a high degree of data parallelism.
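As a rough illustration of the data parallelism described above, a brute-force search can give each worker (for example, each GPU work-item) its own contiguous range of the 2^56 candidate keys. The C sketch below is only an assumption about how such a split could look, not the partitioning used in this project; the names key_range and key_range_for_worker are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Size of the DES key space: 2^56 candidate keys. */
#define DES_KEYSPACE (UINT64_C(1) << 56)

/* Hypothetical helper type: a contiguous block of candidate keys. */
typedef struct {
    uint64_t first;  /* first key index in the block */
    uint64_t count;  /* number of keys in the block  */
} key_range;

/* Split the key space as evenly as possible among n_workers workers.
   The first (DES_KEYSPACE % n_workers) workers receive one extra key. */
static key_range key_range_for_worker(uint64_t worker, uint64_t n_workers) {
    assert(n_workers > 0 && worker < n_workers);
    uint64_t base  = DES_KEYSPACE / n_workers;
    uint64_t extra = DES_KEYSPACE % n_workers;
    key_range r;
    r.count = base + (worker < extra ? 1 : 0);
    r.first = worker * base + (worker < extra ? worker : extra);
    return r;
}
```

Because the ranges do not overlap and together cover the whole key space, every worker can test its keys without communicating with the others.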
Abstract
The Data Encryption Standard (DES) was chosen as a case study because the block cipher
uses permutations and substitutions of data rather than the arithmetic calculations at which
GPUs are known to excel. The goal is to evaluate the potential of GPUs for this type of
application. Furthermore, the DES cipher has a limited 56-bit key space, which has been
successfully cracked using FPGAs. However, FPGAs are much more expensive than GPUs and
require much more programming effort than CUDA and OpenCL. A bit-sliced
implementation of DES was initially considered a suitable candidate algorithm for
implementation on GPUs. The bit-slice method emulates SIMD: it uses each n-bit
register as a slice of the data vector, making it possible to process n bits per operation.
Our tool is based on highly optimized Substitution Boxes (S-boxes). The
nonlinear S-boxes are converted from lookup tables to pure logic, which on average requires
56 operations. This is much faster than cutting the distinct values out of the slice
and sending them through a lookup table, because excessive high-latency data
transfers are replaced with fast bitwise operations.
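The difference between a table lookup and its bit-sliced logic form can be sketched in C as follows. The 3-bit table here is a toy function chosen for brevity, not one of the DES S-boxes, and all function names are hypothetical. Each 64-bit word holds one input bit of 64 independent instances, so two bitwise operations evaluate the function for all 64 instances at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 64  /* one lane per bit of a 64-bit register */

/* Toy 3-bit -> 1-bit table (NOT a DES S-box): index = a | b<<1 | c<<2,
   value = (a AND b) XOR c. */
static const uint8_t toy_table[8] = {0, 0, 0, 1, 1, 1, 1, 0};

/* Conventional approach: one table lookup per input value. */
static void eval_lookup(const uint8_t in[LANES], uint8_t out[LANES]) {
    for (int i = 0; i < LANES; i++)
        out[i] = toy_table[in[i] & 7];
}

/* Transpose into bit-slice form: bit i of planes[b] is bit b of in[i]. */
static void to_bitslice(const uint8_t in[LANES], uint64_t planes[3]) {
    assert(in != NULL && planes != NULL);
    planes[0] = planes[1] = planes[2] = 0;
    for (int i = 0; i < LANES; i++)
        for (int b = 0; b < 3; b++)
            planes[b] |= (uint64_t)((in[i] >> b) & 1) << i;
}

/* The same function as pure logic: two bitwise operations cover all lanes. */
static uint64_t eval_bitslice(const uint64_t planes[3]) {
    return (planes[0] & planes[1]) ^ planes[2];
}

/* Returns 1 if both approaches agree on 64 test inputs derived from seed. */
static int bitslice_matches_lookup(uint8_t seed) {
    uint8_t in[LANES], out[LANES];
    uint64_t planes[3];
    for (int i = 0; i < LANES; i++)
        in[i] = (uint8_t)((i * 5 + seed) & 7);
    eval_lookup(in, out);
    to_bitslice(in, planes);
    uint64_t sliced = eval_bitslice(planes);
    for (int i = 0; i < LANES; i++)
        if (((sliced >> i) & 1) != out[i])
            return 0;
    return 1;
}
```

A real DES S-box maps 6 input bits to 4 output bits, so its logic form needs far more gates than this toy, but the principle of replacing memory lookups with register-only bitwise operations is the same.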
Scope
In general, when any complex problem is implemented, the main expectations concern the throughput and the time taken to
generate the output. Accordingly, this implementation aims to maximize the speed-up of the process and to
reduce its execution time. More promising results can be obtained by running the work on a larger
number of cores, i.e. by using the parallel programming methodology.
Schedule
This project began on December 28, 2014 and continued until January 28, 2015 with the implementation of the
algorithm.
DES Algorithm
As mentioned earlier, there are two main types of cryptography in use today: symmetric
or secret key cryptography and asymmetric or public key cryptography. Symmetric
key cryptography is the oldest type, whereas asymmetric cryptography has only been
used publicly since the late 1970s. Asymmetric cryptography was a major milestone
in the search for a perfect encryption scheme.
Secret key cryptography goes back to at least Egyptian times and is of concern here.
It involves the use of only one key which is used for both encryption and decryption
(hence the use of the term symmetric). Figure 2.1 depicts this idea. It is necessary for
security purposes that the secret key never be revealed.
of Standards (NBS). It was finally adopted in 1977 as the Data Encryption Standard, DES (FIPS PUB 46).
Some of the changes made to LUCIFER have been the subject of much controversy
even to the present day. The most notable of these was the key size. LUCIFER used
a key size of 128 bits; however, this was reduced to 56 bits for DES. Even though DES
actually accepts a 64-bit key as input, the remaining eight bits are used for parity
checking and have no effect on DES's security. Outsiders were convinced that the 56-bit
key was an easy target for a brute-force attack due to its extremely small size. The
need for the parity checking scheme was also questioned without satisfying answers.
Another controversial issue was that the S-boxes used were designed under classified
conditions and no reasons for their particular design were ever given. This led people
to assume that the NSA had introduced a trapdoor through which they could decrypt
any data encrypted by DES even without knowledge of the key. One startling discovery
was that the S-boxes appeared to be secure against an attack known as differential
cryptanalysis, which was only publicly discovered by Biham and Shamir in 1990.
This suggests that the NSA was aware of this attack in 1977, 13 years earlier! In fact,
the DES designers claimed that the reason they never made the design specifications for
the S-boxes available was that they knew about a number of attacks that weren't public
knowledge at the time and they didn't want them leaking. This is quite a plausible
claim, as differential cryptanalysis has shown. However, despite all this controversy, in
1994 NIST reaffirmed DES for government use for a further five years in areas
other than classified.
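The relationship between the 64-bit key input and the 56 effective key bits discussed above can be sketched in C. The sketch assumes the usual DES convention that the least significant bit of each key byte is an odd-parity bit; the function names are hypothetical, and des_effective_key simply concatenates the seven non-parity bits of each byte for illustration (it is not the PC-1 permutation of the standard).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if every byte of the 64-bit key has odd parity, else 0. */
static int des_key_parity_ok(const uint8_t key[8]) {
    assert(key != NULL);
    for (int i = 0; i < 8; i++) {
        uint8_t b = key[i];
        int ones = 0;
        while (b) {          /* count the set bits in this byte */
            ones += b & 1;
            b >>= 1;
        }
        if ((ones & 1) == 0) /* an even count violates odd parity */
            return 0;
    }
    return 1;
}

/* Drops the parity bit of each byte and packs the remaining 7 bits,
   giving a 56-bit value. Illustrative only: not the PC-1 bit order. */
static uint64_t des_effective_key(const uint8_t key[8]) {
    assert(key != NULL);
    uint64_t k = 0;
    for (int i = 0; i < 8; i++)
        k = (k << 7) | (uint64_t)(key[i] >> 1);
    return k;
}
```

Two 64-bit inputs that differ only in their parity bits map to the same 56-bit value, which is why a brute-force search only has to cover 2^56 keys rather than 2^64.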
omit them (although strictly speaking these are not DES as they do not adhere to the standard).
2.2.1 Overall structure
Figure 2.2 shows the sequence of events that occur during an encryption operation.
DES performs an initial permutation on the entire 64-bit block of data. The block is then split
into two 32-bit sub-blocks, Li and Ri, which are then passed into what is known as a
round (see Figure 2.3), of which there are 16 (the subscript i in Li and Ri indicates
the current round). The rounds are identical, and the effect of increasing their
number is twofold: the algorithm's security is increased and its temporal efficiency
decreased. Clearly these are two conflicting outcomes and a compromise must be
made. For DES the number chosen was 16, probably to guarantee the elimination of
any correlation between the ciphertext and either the plaintext or the key. At the end of the
16th round, the 32-bit Li and Ri output quantities are swapped to create what is known
as the pre-output. This [R16, L16] concatenation is permuted using a function which
is the exact inverse of the initial permutation. The output of this final permutation is
the 64-bit ciphertext.
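The round structure described above can be sketched in C as a generic 16-round Feistel network. This is a minimal sketch under stated assumptions: toy_f is a hypothetical stand-in for the real DES round function, the subkeys are supplied by the caller instead of being derived by the DES key schedule, and the initial and final permutations are omitted. It only illustrates the split into Li and Ri, the per-round update, and the final swap into [R16, L16].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ROUNDS 16

/* Hypothetical stand-in for the DES round function f(R, K).
   It is NOT the real expansion / S-box / permutation sequence. */
static uint32_t toy_f(uint32_t r, uint32_t k) {
    uint32_t x = r ^ k;
    x = (x << 5) | (x >> 27);       /* rotate left by 5 bits */
    return x * 2654435761u + k;     /* arbitrary mixing step */
}

/* Runs 16 Feistel rounds on a 64-bit block.
   decrypt == 0: subkeys are applied in order K1..K16.
   decrypt != 0: subkeys are applied in reverse order K16..K1. */
static uint64_t feistel_16(uint64_t block, const uint32_t subkeys[ROUNDS],
                           int decrypt) {
    assert(subkeys != NULL);
    uint32_t l = (uint32_t)(block >> 32);   /* L0 */
    uint32_t r = (uint32_t)block;           /* R0 */
    for (int i = 0; i < ROUNDS; i++) {
        uint32_t k = subkeys[decrypt ? ROUNDS - 1 - i : i];
        uint32_t next_r = l ^ toy_f(r, k);  /* Ri = L(i-1) XOR f(R(i-1), Ki) */
        l = r;                              /* Li = R(i-1) */
        r = next_r;
    }
    /* Swap the halves: the pre-output is [R16, L16]. */
    return ((uint64_t)r << 32) | l;
}
```

Because of the final swap, running the same routine on the output with the subkeys in reverse order restores the original block, whatever round function is used. This is why the round function itself does not need to be invertible.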