Pipelining
A pipeline works like a modern factory assembly line. In a car manufacturing
plant, for example, long assembly lines are set up with a robotic arm at each point
that performs one task, after which the car moves on to the next arm.
Types of Pipeline
1. Arithmetic Pipeline
2. Instruction Pipeline
Arithmetic Pipeline
Arithmetic pipelines are found in most computers. They are used for floating
point operations, multiplication of fixed point numbers, etc. For example, the
inputs to a floating point adder pipeline are:
X = A*2^a
Y = B*2^b
Registers are used for storing the intermediate results between the above
operations.
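The adder's individual operations are not listed above. The sketch below assumes a common textbook decomposition into four stages (compare exponents, align mantissas, add mantissas, normalize the result). The function names, the (mantissa, exponent) tuple representation and the normalization range 0.5 <= |M| < 1 are illustrative assumptions. Each intermediate tuple stands in for a register between two stages.

```python
def compare_exponents(x, y):
    """Stage 1: compute the exponent difference a - b."""
    (A, a), (B, b) = x, y
    return A, a, B, b, a - b

def align_mantissas(A, a, B, b, diff):
    """Stage 2: shift the mantissa of the operand with the smaller exponent."""
    if diff >= 0:
        return A, B / 2 ** diff, a
    return A / 2 ** (-diff), B, b

def add_mantissas(A, B, exp):
    """Stage 3: add the aligned mantissas."""
    return A + B, exp

def normalize(m, exp):
    """Stage 4: rescale so that 0.5 <= |m| < 1 (assumed convention)."""
    if m == 0:
        return 0.0, 0
    while abs(m) >= 1.0:
        m /= 2.0
        exp += 1
    while abs(m) < 0.5:
        m *= 2.0
        exp -= 1
    return m, exp

def fp_add(x, y):
    r1 = compare_exponents(x, y)   # register after stage 1
    r2 = align_mantissas(*r1)      # register after stage 2
    r3 = add_mantissas(*r2)        # register after stage 3
    return normalize(*r3)

# X = 0.5 * 2^3 = 4 and Y = 0.75 * 2^1 = 1.5, so X + Y = 5.5 = 0.6875 * 2^3
print(fp_add((0.5, 3), (0.75, 1)))
```

In a real pipeline the four stages work on four different operand pairs in the same clock cycle; the sequential calls here only show what each stage hands to the next.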
Instruction Pipeline
Pipeline Conflicts
Several factors cause a pipeline to deviate from its normal performance.
Some of these factors are given below:
1. Timing Variations
Not all stages take the same amount of time. This problem generally occurs in
instruction processing, where different instructions have different operand
requirements and thus different processing times.
2. Data Hazards
When several instructions are in partial execution and reference the same data,
a problem arises. We must ensure that the next instruction does not attempt to
access the data before the current instruction has finished with it, because this
would lead to incorrect results.
3. Branching
In order to fetch and execute the next instruction, we must know what that
instruction is. If the present instruction is a conditional branch whose result
determines the next instruction, then the next instruction may not be known until
the current one is processed.
4. Interrupts
Interrupts insert unwanted instructions into the instruction stream. Interrupts
affect the execution of instructions.
5. Data Dependency
It arises when an instruction depends upon the result of a previous instruction but
this result is not yet available.
Advantages of Pipelining
Disadvantages of Pipelining
Scalar CPUs can manipulate one or two data items at a time, which is not very
efficient. Also, simple instructions such as "add A to B and store into C" are not
practically efficient.
Addresses are used to point to the memory location where the data to be operated
on will be found, which adds the overhead of a data lookup. Until the data is
found, the CPU would be sitting idle, which is a big performance issue.
Hence, the concept of the instruction pipeline comes into the picture, in which
an instruction passes through several sub-units in turn. These sub-units perform
various independent functions: for example, the first sub-unit decodes the
instruction, the second fetches the data and the third performs the math
itself. Therefore, while the data is fetched for one instruction, the CPU does not
sit idle; it works on decoding the next instruction, ending up working like an
assembly line.
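The overlap described above can be sketched as a small timing table. This is a minimal model, assuming the three sub-units named in the text, one clock cycle per stage and no stalls; the function name pipeline_schedule is hypothetical.

```python
STAGES = ["decode", "fetch data", "execute"]

def pipeline_schedule(num_instructions, stages=STAGES):
    """Return one dict per clock cycle mapping stage name -> instruction number."""
    k = len(stages)
    total_cycles = num_instructions + k - 1 if num_instructions else 0
    schedule = []
    for cycle in range(total_cycles):
        row = {}
        for s, name in enumerate(stages):
            instr = cycle - s  # instruction i (0-based) is in stage s during cycle i + s
            if 0 <= instr < num_instructions:
                row[name] = instr + 1  # report 1-based instruction numbers
        schedule.append(row)
    return schedule

for cycle, row in enumerate(pipeline_schedule(4), start=1):
    print(cycle, row)
```

Under these assumptions, 4 instructions finish in 4 + 3 - 1 = 6 cycles, whereas running each instruction through all three sub-units before starting the next would take 4 * 3 = 12 cycles.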
A vector processor not only uses an instruction pipeline but also pipelines the
data, working on multiple data items at the same time.
In a vector processor, a single instruction can request multiple data operations,
which saves time because the instruction is decoded once and then keeps operating
on different data items.
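The saving in decode work can be illustrated with a simple counting model. This is only a sketch: the decode counter is a stand-in for instruction decoding, and the function names scalar_add and vector_add are hypothetical.

```python
def scalar_add(a, b):
    """Scalar style: one ADD instruction is decoded for every element pair."""
    decodes = 0
    out = []
    for x, y in zip(a, b):
        decodes += 1
        out.append(x + y)
    return out, decodes

def vector_add(a, b):
    """Vector style: one vector ADD is decoded once and applied to all pairs."""
    decodes = 1
    return [x + y for x, y in zip(a, b)], decodes

a, b = [1, 2, 3, 4], [10, 20, 30, 40]
print(scalar_add(a, b))  # same sums, one decode per element
print(vector_add(a, b))  # same sums, a single decode
```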
1. Petroleum exploration.
2. Medical diagnosis.
3. Data analysis.
4. Weather forecasting.
5. Aerodynamics and space flight simulations.
6. Image processing.
7. Artificial intelligence.
Superscalar Processors
A superscalar processor increases throughput because the CPU can execute multiple
instructions per clock cycle. Thus, superscalar processors can be much faster than
scalar processors.
A scalar processor works on one or two data items, while a vector
processor works with multiple data items. A superscalar processor is a
combination of both. Each instruction processes one data item, but there are
multiple execution units within each CPU, so multiple instructions can
process separate data items concurrently.
While a superscalar CPU is also pipelined, these are two different performance
enhancement techniques. It is possible to have a non-pipelined superscalar CPU or
a pipelined non-superscalar CPU. The superscalar technique is associated with some
characteristics; these are:
Pipelining Hazards
Whenever a pipeline has to stall for some reason, the cause is called a pipeline
hazard. Four pipelining hazards are discussed below.
1. Data Dependency
In the figure above, the result of the Add instruction is stored in register R2,
and the final result is stored at the end of the instruction's execution, which
happens at clock cycle t4.
But the Sub instruction needs the value of register R2 at cycle t3, so the Sub
instruction has to stall for two clock cycles. If it does not stall, it will
generate an incorrect result. This dependence of one instruction on another
instruction for data is called data dependency.
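The stall count in this example can be checked with a short sketch. The instruction record, the helper names and the registers other than R2 are hypothetical. The model assumes a result written at the end of cycle t is readable from cycle t + 1 onward.

```python
from collections import namedtuple

# Hypothetical instruction record: opcode, destination register, source registers.
Instr = namedtuple("Instr", ["op", "dest", "srcs"])

def has_data_dependency(producer, consumer):
    """True if the consumer reads the register that the producer writes."""
    return producer.dest in consumer.srcs

def stall_cycles(result_ready_end_of, wanted_in):
    """Cycles the consumer waits: the result is readable one cycle after it is written."""
    earliest_read = result_ready_end_of + 1
    return max(0, earliest_read - wanted_in)

add = Instr("ADD", "R2", ("R1", "R3"))  # R1 and R3 are assumed operands
sub = Instr("SUB", "R4", ("R2", "R5"))  # R4 and R5 are assumed operands

if has_data_dependency(add, sub):
    # Add writes R2 at the end of t4; Sub wants R2 in t3.
    print(stall_cycles(result_ready_end_of=4, wanted_in=3))  # prints 2
```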
2. Memory Delay
3. Branch Delay
Suppose four instructions I1, I2, I3, I4 are pipelined in sequence. Instruction
I1 is a branch instruction and its target instruction is Ik. Processing
starts: instruction I1 is fetched and decoded, and the target address is computed
at the 4th stage in cycle t3.
By then, instructions I2, I3, I4 have been fetched in cycles 1, 2 and 3, before
the branch target address is computed. Since I1 is found to be a branch
instruction, instructions I2, I3, I4 have to be discarded, because instruction Ik
has to be processed next after I1. This delay of three cycles (1, 2, 3) is the
branch delay.
Prefetching the branch target address reduces the branch delay. For example, if
the branch target is identified at the decode stage, the branch delay reduces to 1
clock cycle.
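The relation between the stage that resolves the branch target and the branch delay can be sketched as follows. The model assumes one instruction is fetched per cycle, stages are numbered from 1 with fetch as stage 1 and decode as stage 2, and every instruction fetched after the branch is discarded. The function names are hypothetical.

```python
def branch_delay(resolve_stage):
    """Instructions fetched (and later discarded) before the target is known."""
    return max(0, resolve_stage - 1)

def discarded_instructions(program, branch_index, resolve_stage):
    """Instructions after the branch that were fetched before the target was known."""
    n = branch_delay(resolve_stage)
    return program[branch_index + 1 : branch_index + 1 + n]

program = ["I1", "I2", "I3", "I4"]
print(discarded_instructions(program, 0, resolve_stage=4))  # ['I2', 'I3', 'I4']
print(discarded_instructions(program, 0, resolve_stage=2))  # ['I2']
```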
4. Resource Limitation
If two instructions request access to the same resource in the same clock
cycle, one of the instructions has to stall and let the other instruction use the
resource. This stalling is due to resource limitation. However, it can be prevented
by adding more hardware.
Advantages of Pipelining:
Disadvantages of Pipelining:
Pipelining has many disadvantages, though CPU and compiler designers use many
techniques to overcome most of them. The following is a list of common
drawbacks:
Reservation Table: Displays the time-space flow of data through the pipeline for
one function evaluation.
Time  1  2  3  4  5  6  7  8
S1    X              X     X
S2       X     X
S3          X     X     X
Reservation table for a function X (rows are stages S1 to S3)
Latency: The number of time units (clock cycles) between two initiations of
a pipeline is the latency between them. Latency values must be non-negative
integers.
Latencies that do not cause any collision are called permissible latencies
(e.g., in the reservation table above, 1, 3 and 6 are permissible latencies).
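Permissible latencies can be derived mechanically from a reservation table: a latency causes a collision if any single stage is marked at two times that far apart. In the sketch below, the mark positions in RESERVATION_TABLE are an assumption, chosen to match the three-stage, eight-cycle shape of the table above and to agree with the permissible latencies 1, 3 and 6 stated in the text. The function names are hypothetical.

```python
# Assumed mark positions: stage name -> clock cycles in which the stage is used.
RESERVATION_TABLE = {
    "S1": [1, 6, 8],
    "S2": [2, 4],
    "S3": [3, 5, 7],
}

def forbidden_latencies(table):
    """Latencies at which two initiations would need the same stage at the same time."""
    forbidden = set()
    for times in table.values():
        for i, t1 in enumerate(times):
            for t2 in times[i + 1:]:
                forbidden.add(abs(t2 - t1))
    return forbidden

def permissible_latencies(table):
    """Collision-free latencies shorter than the table's total time span."""
    span = max(t for times in table.values() for t in times)
    forbidden = forbidden_latencies(table)
    return [lat for lat in range(1, span) if lat not in forbidden]

print(sorted(forbidden_latencies(RESERVATION_TABLE)))  # [2, 4, 5, 7]
print(permissible_latencies(RESERVATION_TABLE))        # [1, 3, 6]
```

Latencies equal to or larger than the table's span (8 here) never collide, because the first evaluation has already left the pipeline; the function lists only the shorter ones.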