Design and Performance Analysis of 8-Bit RISC Processor Using Xilinx & Microwind Tool
R.Uma
(Research Scholar, Department of Computer Science, Pondicherry University, Pondicherry Email: [email protected])
Abstract
RISC (Reduced Instruction Set Computer) is a design philosophy that has become mainstream in scientific and engineering applications. The increasing performance and gate capacity of recent FPGA devices permit complex logic systems to be implemented on a single programmable device. In an FPGA the design is hardwired, whereas an ASIC-based implementation offers the flexibility to minimize gate count and delay. The main objective of this paper is therefore to design and implement an 8-bit RISC processor using the Xilinx tool and the Microwind tool, and to analyze its performance. The processor is deliberately simple and supports a load/store architecture. Its main components are the Arithmetic Logic Unit, Shifter, Rotator and Control Unit. Module functionality and performance measures such as area, power dissipation and propagation delay are analyzed at 90 nm process technology, using the Xilinx tool with a SPARTAN 3E XC3S500E device for the FPGA and the Microwind tool for the ASIC design.
February Issue
Page 37 of 84
International Journal of Advances in Science and Technology, Vol. 4, No. 2, 2012
selected instruction. The architecture supports 16 instructions covering arithmetic, logical, shift and rotate operations. The remainder of this paper is organized as follows. Section 2 explains the architecture of the 8-bit RISC processor. Section 3 presents the design modules (ALU, control unit and general purpose registers) in both FPGA and ASIC. Section 4 presents the simulation results for the advanced 90 nm process technology and the FPGA implementation. Section 5 summarizes the implementation of the RISC design topology. The final section presents the conclusion.
[Figure: Block diagram of the 8-bit RISC processor, showing the control unit, instruction register, instruction decoder, register A, register B, universal shift register, ALU, barrel shift register and accumulator]
The control unit reads the opcode and instruction bits and generates the control signals that trigger the respective components and data path to perform the desired task. It contains two instruction decoders that decode the instruction bits; the decoded output is fed as a control signal to the arithmetic logic unit (ALU), the universal shifter or the barrel shift rotator. The ALU receives its operands from register A and register B and, depending on the control signal, performs either an arithmetic or a logic operation. After the instruction is executed, the result is stored in the accumulator register. The universal shifter takes its input from source register A and either loads it or shifts it right or left, according to the control lines activated by the control unit; the shifted data is saved in the destination register, which is the accumulator. The barrel shift rotator likewise takes its input from source register A and rotates it N times according to the opcode supplied by the control unit; the rotated data is stored in the accumulator register.
d0   1 0 0 0 0 0 0 0      Z0   1 0 0 0 0 0 0 0
d1   0 1 0 0 0 0 0 0      Z1   0 1 0 0 0 0 0 0
d2   0 0 1 0 0 0 0 0      Z2   0 0 1 0 0 0 0 0
d3   0 0 0 1 0 0 0 0      Z3   0 0 0 1 0 0 0 0
d4   0 0 0 0 1 0 0 0      Z4   0 0 0 0 1 0 0 0
d5   0 0 0 0 0 1 0 0      Z5   0 0 0 0 0 1 0 0
S3 S2 S1 S0    OPERATION
1  0  0  0     NOT
1  0  0  1     NO CHANGE
1  0  1  0     SHIFT-RIGHT
1  0  1  1     SHIFT-LEFT
1  1  0  0     ROTATE 1-BIT
1  1  0  1     ROTATE 3-BIT
1  1  1  0     ROTATE 5-BIT
1  1  1  1     ROTATE 7-BIT
B. Arithmetic Logic Unit
The ALU design comprises two units. One unit performs logic operations and contains eight-bit logic gates (AND, NAND, OR, NOR, XOR and XNOR); the other performs the arithmetic operations ADD and SUBTRACT. In the arithmetic unit, the control input Cin selects the operation: when Cin is low the input data are added, and when Cin is high they are subtracted. The complete ALU design in FPGA and ASIC is shown in Figures 5a and 5b, the internal submodule of the arithmetic unit in Figures 6a and 6b, and the simulated timing waveforms for the arithmetic unit and the carry look-ahead adder (CLA), obtained with the Microwind and Xilinx tools, in Figures 7 and 7.1.
Figure 5a. Top block of 8 bit arithmetic and logic unit in FPGA
Figure 5b. Top block of 8 bit arithmetic and logic unit in ASIC
Figure 7.1 Simulated timing diagram of carry look ahead in FPGA and ASIC
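The behaviour of the two ALU units described in this section can be sketched in software. The following Python model is only an illustration, not the gate-level design: the names `arithmetic_unit`, `logic_unit` and `LOGIC_OPS` are hypothetical, and subtraction is assumed to be carried out in two's complement, with Cin both inverting operand B and feeding the adder's carry input.

```python
MASK = 0xFF  # 8-bit data path


def arithmetic_unit(a, b, cin):
    """Return (result, carry_out): a + b when cin is 0, a - b when cin is 1."""
    # Assumption: subtraction is done as a + (~b) + 1 (two's complement),
    # with cin both inverting operand b and feeding the adder's carry-in.
    b_eff = (b ^ (MASK if cin else 0)) & MASK
    total = (a & MASK) + b_eff + (1 if cin else 0)
    return total & MASK, (total >> 8) & 1


# The six bitwise operations named for the logic unit.
LOGIC_OPS = {
    "AND": lambda a, b: a & b,
    "NAND": lambda a, b: ~(a & b),
    "OR": lambda a, b: a | b,
    "NOR": lambda a, b: ~(a | b),
    "XOR": lambda a, b: a ^ b,
    "XNOR": lambda a, b: ~(a ^ b),
}


def logic_unit(op, a, b):
    """Apply one of the six bitwise operations and keep the low 8 bits."""
    return LOGIC_OPS[op](a, b) & MASK


print(arithmetic_unit(5, 3, 0))                  # (8, 0)
print(arithmetic_unit(5, 3, 1))                  # (2, 1)
print(f"{logic_unit('NAND', 0xF0, 0x3C):08b}")   # 11001111
```

Under this assumption a carry out of 1 after a subtraction means that no borrow occurred.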
C. Universal Shift Register
The universal shift register supports four operations: load, right shift, left shift and no change. The design uses eight 4x1 multiplexers and nine basic gates and is shown in Figure 8 for FPGA and ASIC. The eight-bit input is loaded when control lines S0 and S1 are both low. The input is shifted right when S0 is high and S1 is low, and shifted left when S0 is low and S1 is high. When S0 and S1 are both high the output remains low. The complete operation is listed in Table 2, and Figure 9 shows the simulated result of the universal shift register in FPGA and ASIC.
Figure 8. Top block of universal shift register in FPGA & ASIC
Table 2. Operation of the universal shift register
INPUT A7..A0       Cin   S1 S0   OUTPUT Q7..Q0      Cout   OPERATION PERFORMED
1 1 1 1 0 0 0 0    0     0  0    1 1 1 1 0 0 0 0    0      Load
1 1 1 1 0 0 0 0    1     0  0    1 1 1 1 0 0 0 0    0      Load
1 1 1 1 0 0 0 0    0     0  1    0 1 1 1 1 0 0 0    1      Right shift
1 1 1 1 0 0 0 0    1     0  1    0 1 1 1 1 1 1 1    1      Right shift
1 1 1 1 0 0 0 0    0     1  0    0 1 1 1 1 0 0 0    0      Left shift
1 1 1 1 0 0 0 0    1     1  0    1 1 1 1 1 0 0 0    0      Left shift
1 1 1 1 0 0 0 0    0     1  1    0 0 0 0 0 0 0 0    0      No change
1 1 1 1 0 0 0 0    1     1  1    0 0 0 0 0 0 0 0    0      No change
Figure 9. Simulated timing diagram of universal shift register in FPGA & ASIC
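The select-line behaviour described in the text of this section can be written as a small behavioural model. This Python sketch follows the prose only (S1 S0 = 00 load, 01 right shift, 10 left shift, 11 output low) and is not the multiplexer netlist. The function name `universal_shift` and the `serial_in` parameter are hypothetical, and treating Cin as the bit shifted into the vacated position is an assumption.

```python
MASK = 0xFF  # 8-bit data path


def universal_shift(data, s1, s0, serial_in=0):
    """Behavioural model of the universal shift register for one operation."""
    data &= MASK
    if (s1, s0) == (0, 0):
        # Load: the input word passes through unchanged.
        return data
    if (s1, s0) == (0, 1):
        # Right shift: serial_in (assumed to be Cin) fills bit 7.
        return (data >> 1) | ((serial_in & 1) << 7)
    if (s1, s0) == (1, 0):
        # Left shift: serial_in (assumed to be Cin) fills bit 0.
        return ((data << 1) & MASK) | (serial_in & 1)
    # S1 = S0 = 1: the text states that the output remains low.
    return 0


print(f"{universal_shift(0b11110000, 0, 1):08b}")  # 01111000
print(f"{universal_shift(0b11110000, 1, 0):08b}")  # 11100000
```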
D. Barrel Shift Rotator
The design consists of eight 8x1 multiplexers. The output of one multiplexer is connected as an input to the next, so that the input data is shifted at each multiplexer, which performs the rotation. The number of rotations depends on the select lines. With all select lines low there is no output. If select line S0 is high a 1-bit rotation takes place, if S1 is high a 2-bit rotation takes place, and the rotation count continues to grow until all select lines are high. The rotation of the input data for the different select-line values is shown in Table 3, and the simulated timing diagram in FPGA and ASIC is shown in Figure 11. Figure 10 shows the top block of the barrel shift rotator in FPGA and ASIC.
Table 3. Operations of Barrel rotator
INPUT OF ROTATOR           S2 S1 S0   OUTPUT OF ROTATOR          FUNCTION PERFORMED
A7 A6 A5 A4 A3 A2 A1 A0               Q7 Q6 Q5 Q4 Q3 Q2 Q1 Q0
1  1  1  1  0  0  0  0     0  0  0    0  0  0  0  0  0  0  0     Zero
1  1  1  1  0  0  0  0     0  0  1    1  1  1  0  0  0  0  1     1 Bit Rotate
1  1  1  1  0  0  0  0     0  1  0    1  1  0  0  0  0  1  1     2 Bit Rotate
1  1  1  1  0  0  0  0     0  1  1    1  0  0  0  0  1  1  1     3 Bit Rotate
1  1  1  1  0  0  0  0     1  0  0    0  0  0  0  1  1  1  1     4 Bit Rotate
1  1  1  1  0  0  0  0     1  0  1    0  0  0  1  1  1  1  0     5 Bit Rotate
1  1  1  1  0  0  0  0     1  1  0    0  0  1  1  1  1  0  0     6 Bit Rotate
1  1  1  1  0  0  0  0     1  1  1    0  1  1  1  1  0  0  0     7 Bit Rotate
Figure 11. Timing Diagram of Barrel shift rotator in FPGA and ASIC
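The rotation selected by S2 S1 S0 can be checked with a short behavioural model. The Python sketch below is an illustration under stated assumptions rather than the multiplexer network itself: the function name `barrel_rotate` is hypothetical, the rotation is taken to be toward the most significant bit (the direction the Table 3 outputs suggest for the input 11110000), and a select value of 000 is modelled as an all-zero output, following the statement that there is no output when all select lines are low.

```python
def barrel_rotate(data, select):
    """Rotate an 8-bit word left by `select` positions (0 gives all zeros)."""
    select &= 0b111          # three select lines S2 S1 S0
    if select == 0:
        # Modelled as the "Zero" row: no output with all select lines low.
        return 0
    data &= 0xFF
    # Bits shifted out at the top re-enter at the bottom.
    return ((data << select) | (data >> (8 - select))) & 0xFF


for n in range(8):
    print(f"S2S1S0={n:03b}  Q={barrel_rotate(0b11110000, n):08b}")
```

A rotation by n positions in this direction is the same as a rotation by 8 - n positions in the opposite direction, so the model makes no claim about how the hardware wires the multiplexer inputs.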
E. General Purpose Register
The eight-bit input data is stored in this register, which acts as a source register. It consists of eight D flip-flops and eight AND gates. The gate-level view of the register is given in Figure 12. Initially RESET is set high to clear the register. With RESET low and READ high, the data is stored in the register whether CLOCK is low or high. The conditions under which the data is stored are shown in Table 4, and the simulated timing waveform in Figure 13.
Table 4. Operations of general purpose register
CLK  RESET  RD   INPUT Q7..Q0       OUTPUT Q7..Q0
0    1      0    0 0 0 0 0 0 0 1    0 0 0 0 0 0 0 0
1    0      1    0 0 0 0 0 0 0 1    0 0 0 0 0 0 0 1
0    0      1    0 0 0 0 0 0 0 1    0 0 0 0 0 0 0 1
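The RESET and READ conditions for the general purpose register can be summarized in a few lines of Python. This is a minimal behavioural sketch, not the flip-flop and AND-gate circuit: the class name `GeneralPurposeRegister` and its `step` method are hypothetical, the clock is omitted because the text states that data is stored with CLOCK either low or high, and holding the previous value when READ is low is an assumption that the text does not spell out.

```python
class GeneralPurposeRegister:
    """Behavioural model of the 8-bit source register."""

    def __init__(self):
        self.q = 0  # stored value Q7..Q0

    def step(self, data, reset, rd):
        """Apply one set of control inputs and return the register output."""
        if reset:
            # RESET high clears the register.
            self.q = 0
        elif rd:
            # RESET low and READ high: the input data is stored.
            self.q = data & 0xFF
        # Assumption: with RESET low and READ low the old value is held.
        return self.q


reg = GeneralPurposeRegister()
print(reg.step(0b00000001, reset=1, rd=0))  # 0
print(reg.step(0b00000001, reset=0, rd=1))  # 1
```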
TOPOLOGY       SUB BLOCK   DELAY (ns)   AREA (slices)   AT        AT2      POWER (W)   PD (10-9 W)
Control Unit   Decoder     6.275        8               50.2      315.0    0.081       0.508
ALU            AND         5.753        4               23.012    132.3    0.081       0.463
               NAND        5.753        4               23.012    132.3    0.081       0.463
               NOR         5.753        4               23.012    132.3    0.081       0.463
               XOR         5.753        4               23.012    132.3    0.081       0.463
               CLA         7.732        5               38.66     298.9    0.081       0.626
               Inverter    6.034        4               24.136    144.6    0.081       0.488
               AND         6.546        4               26.184    171.4    0.081       0.530
               OR          7.508        14              105.11    789.1    0.081       0.608
               MUX         6.582        9               59.238    389.9    0.081       0.533
               MUX         7.198        16              115.16    828.9    0.081       0.583
               D-FF        6.546        4               26.184    171.4    0.081       0.530
Total                      77.43        80              536.92    3638                 6.258
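The derived columns of the FPGA table appear to follow from the delay, area and power figures: the tabulated values are consistent with AT = delay x area, AT2 = AT x delay and PD = power x delay. The short Python check below illustrates this for the Decoder entry (delay 6.275, area 8, power 0.081). The function name `figures_of_merit` is hypothetical, and reading the columns in this way is an assumption based on the arithmetic, since the text does not define them.

```python
def figures_of_merit(delay_ns, area_slices, power):
    """Return (AT, AT2, PD) computed from delay, area and power."""
    at = delay_ns * area_slices   # area-time product
    at2 = at * delay_ns           # area-time-squared product
    pd = power * delay_ns         # power-delay product
    return at, at2, pd


at, at2, pd = figures_of_merit(6.275, 8, 0.081)
print(f"AT = {at:.1f}, AT2 = {at2:.1f}, PD = {pd:.3f}")
# AT = 50.2, AT2 = 315.0, PD = 0.508
```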
Table 5a. Delay Vs Area of 8-bit processor in microwind
TOPOLOGY       SUB BLOCKS   TRANSISTOR COUNT   RISE DELAY (ns)   FALL DELAY (ns)   POWER DISSIPATION (mW)
Control Unit   Decoder                         0.062             0.056             2.6854
ALU            AND                             0.028             0.025             3.7224
               NAND                            0.019             0.016             7.5252
               NOR                             0.008             0.005             8.2032
               OR                              0.006             0.001             11.899
               XOR                             0.008             0.004             10.699
               XNOR                            0.020             0.015             1.8804
               CLA                             0.490             0.486             6.2088
               MUX                             0.759             0.755             15.734
               MUX                             0.408             0.393             15.098
               D-FF                            0.628             0.618             17.992
Total                                          2.436             2.374             101.64
F. The Instruction Set Format Rule
The instruction set of the RISC processor has been designed according to the following rules. All instructions are executed in just one clock cycle, which makes the processor simpler, smaller, faster and easier to understand. The instruction code is received at the beginning of each cycle, all operations are executed during the clock period, and the results are stored at the end of it. ALU operations take two operands from registers and store the result in one of them. External read and write operations are synchronous.
4. Result
The performance of the RISC processor has been evaluated in this work using the Xilinx and Microwind tools. The design meets the need for a high-performance logic solution for high-volume, very low cost, consumer-oriented applications. The RISC processor designed in the Xilinx tool employs multi-voltage, multi-standard SelectIO interface pins supporting 3.3 V, 2.5 V, 1.8 V, 1.5 V and 1.2 V at a data transfer rate of 622+ Mb/s. It operates over a frequency range of 5 MHz to 300 MHz. The Microwind tool combines the traditionally separate front-end and back-end chip design steps into an integrated flow, which accelerates the design cycle and reduces design complexity. It tightly integrates mixed-signal implementation with digital implementation, circuit simulation, transistor-level extraction and verification. The performance of the RISC processor in the Microwind tool is evaluated with 0.12 µm CMOS technology. The simulations are carried out at VDD = 1.2 V, an I/O supply voltage of 2.5 V and a room temperature of 27 °C, with the empirical level 3 device model and Monte-Carlo analysis, using the MOSFET model parameters given below for each module:
*n-Mos Model
*low leakage
Model N1 NMOS level = 3 VTO = 0.40 UO = 600.000 TOX = 2.0E-9
+LD = 0.000 THETA = 0.500 GAMMA = 0.400
+PHI = 0.200 KAPPA = 0.060 VMAX = 120.00K
+CGSO = 100.0p CGDO = 100.0
+CGBO = 60.0p CJSW = 240.0P
*p-Mos Model
*low leakage
The overall design of the 8-bit processor is shown in Figure 14, and the simulation of its overall execution in Figure 15. The processor has two eight-bit input signals, A7 - A0 and B7 - B0, which are taken externally and loaded into registers A and B respectively. The memory interface signal is READ (RD); it indicates that the selected memory location is to be read and the data placed on the data bus. The various operations are synchronized by the CLK signal. The processor has two control signals, RD and RESET. If RESET is high the processor performs no operation and stays in the idle state. If RESET is low and RD is high, the data is placed on the data bus and the corresponding values are loaded into general purpose registers A and B. Depending on the opcode provided by the control unit, the particular operation is performed as stated in Table 1. This 8-bit RISC processor executes each instruction in one clock cycle. clk is the external clock, which is always equal to one; it triggers the inputs and produces the desired output. RD triggers the state of the registers through which data is passed into the internal registers A and B. I0 to I3 specify the opcode that enables the operation. For example, if the opcode value is 0111 the operation performed is addition.
5. Summary
This section presents the overall performance of the 8-bit RISC processor obtained from the Xilinx and Microwind tools. Table 6 compares the performance of the designed processor in terms of delay, area and power dissipation.
Table 6. Overall performance of 8 bit RISC in FPGA & ASIC
RISC    Delay (ns)   Area            Power dissipation
FPGA    77.43        80 (slices)     6.258 W
ASIC    5.39         2954 (gates)    101.64 mW
The overall delay of the processor is observed to be 77.43 ns in FPGA and 5.39 ns in ASIC. The overall power dissipation is observed to be 6.258 W in FPGA and 101.64 mW in ASIC. The power dissipation could be reduced further if the circuit were designed with adiabatic logic.
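To put the two implementations side by side, the reported figures can be expressed as ratios. The Python lines below are a simple illustration using only the delay and power values of Table 6; the dictionary names `fpga` and `asic` are hypothetical, and the ratios are derived here rather than stated in the paper.

```python
# Overall figures as reported in Table 6 (power converted to watts).
fpga = {"delay_ns": 77.43, "power_w": 6.258}
asic = {"delay_ns": 5.39, "power_w": 101.64e-3}

delay_ratio = fpga["delay_ns"] / asic["delay_ns"]
power_ratio = fpga["power_w"] / asic["power_w"]

print(f"FPGA/ASIC delay ratio: {delay_ratio:.1f}")
print(f"FPGA/ASIC power ratio: {power_ratio:.1f}")
```

On these figures the ASIC implementation is roughly 14 times faster and dissipates roughly 60 times less power than the FPGA implementation.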
6. Conclusion
An 8-bit RISC processor with a 16-instruction set has been designed. Every instruction is executed in one clock cycle with 3-stage pipelining. The design has been verified through exhaustive simulation, and its performance has been compared using the Xilinx and Microwind tools. The processor can be used as a systolic core to perform mathematical computations such as solving polynomial and differential equations. It can also be used in portable gaming kits.
AUTHOR PROFILE
She graduated with a B.E. (EEE) from Bharathiyar University, Coimbatore, in 1998 and received her M.E. (VLSI Design) from Anna University, Chennai, in 2004. She is currently working as Assistant Professor in Electronics and Communication Engineering at Rajiv Gandhi College of Engineering and Technology, Puducherry. She has been teaching VLSI Design, Embedded Systems, and Microprocessors and Microcontrollers to PG and UG students. She has authored books on VLSI Design and has published several papers in national conferences and symposia. She is a guest faculty member at Pondicherry University for M.Tech Electronics and has been actively guiding PG and UG students in the areas of VLSI, embedded systems and image processing. She received the best teacher award for the years 2006 and 2007. Her research interests are analog VLSI design, low power VLSI design, testing of VLSI circuits, embedded systems and image processing. She is a member of ISTE.