Computer Architecture
Thus a pipelined processor has a pipeline containing a number of stages (4 or 5 was a common number in early RISC processors) arranged so that, at each clock, a new instruction is latched into a stage's input register as the result calculated in that stage is latched into the input register of the following stage. This means that there will be a number of instructions (equal to the number of pipeline stages in the best case) "active" in the processor at any one time.
In a typical early RISC processor (eg the MIPS R3000), there would be four pipeline stages:
[Figure: pipeline timing diagram. i1, i2, ... are successive instructions in the instruction stream.]
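This overlap can be sketched with a toy Python model (an illustration only - the full stage-name set IF, RF, EX, WB is an assumption, as the text names only EX and WB): in an ideal pipeline, instruction i enters the first stage at cycle i, so once the pipe is full, a different instruction occupies every stage.

```python
# Toy model of an ideal 4-stage pipeline.
# Stage names IF/RF/EX/WB are assumed for illustration.
STAGES = ["IF", "RF", "EX", "WB"]

def schedule(n_instructions):
    """Return {cycle: {stage: instruction}} for an ideal (no-stall) pipeline."""
    timeline = {}
    for i in range(n_instructions):          # instruction i1 is index 0
        for s, stage in enumerate(STAGES):
            cycle = i + s                    # i enters IF at cycle i, WB at i+3
            timeline.setdefault(cycle, {})[stage] = f"i{i + 1}"
    return timeline

t = schedule(6)
# Once the pipeline is full, all four stages hold different instructions:
print({s: t[3][s] for s in STAGES})   # {'IF': 'i4', 'RF': 'i3', 'EX': 'i2', 'WB': 'i1'}
```

With four stages active every cycle, one instruction completes per cycle in the steady state - the source of the pipeline's throughput advantage.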
We can achieve this high throughput only if every instruction takes exactly one cycle to execute. Although most arithmetic instructions in a RISC machine complete in a single cycle, some complex instructions (eg divide) take more than a cycle. Such an instruction remains in one of the stages (eg EX) for more than a cycle and creates a pipeline bubble ahead of it in the pipeline.
i2 is a long-latency instruction, eg a divide, taking 3 cycles in the EX stage. A "bubble" develops ahead of it in the pipeline as the instructions ahead of it move on to the next stage.
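The stall can be reproduced cycle by cycle in the same toy-model style (Python; the stage names IF/RF/EX/WB and the 3-cycle divide are assumptions for illustration):

```python
# Toy cycle-by-cycle model of a 4-stage pipeline (IF, RF, EX, WB - assumed
# names) in which one instruction occupies EX for several cycles.
def wb_trace(n_instr, ex_cycles):
    """ex_cycles maps instruction index -> EX cycles; returns, per cycle,
    the instruction (if any) completing in WB."""
    stages = [None, None, None, None]       # IF, RF, EX, WB
    ex_left = 0
    next_i = 0
    trace = []
    while next_i < n_instr or any(s is not None for s in stages):
        stages[3] = None                    # WB always takes one cycle
        if stages[2] is not None and ex_left > 1:
            ex_left -= 1                    # EX still busy: everything behind stalls
        else:                               # advance the whole pipeline one stage
            stages[3] = stages[2]
            stages[2] = stages[1]
            if stages[2] is not None:
                ex_left = ex_cycles[stages[2]]
            stages[1] = stages[0]
            stages[0] = next_i if next_i < n_instr else None
            if stages[0] is not None:
                next_i += 1
        trace.append(stages[3])
    return trace

# i2 (index 1) spends 3 cycles in EX; i1, i3, i4 take one cycle everywhere.
trace = wb_trace(4, {0: 1, 1: 3, 2: 1, 3: 1})
done = {f"i{i + 1}": c for c, i in enumerate(trace, start=1) if i is not None}
print(done)   # {'i1': 4, 'i2': 7, 'i3': 8, 'i4': 9}
```

i1 completes in cycle 4 but i2 not until cycle 7: in cycles 5 and 6 nothing reaches WB - those are the two bubbles travelling down the pipe ahead of the divide.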
Efficient handling of branch instructions presents a significant challenge to a computer architect. Even the simplest branch - an unconditional one, such as the return from a procedure or function - requires a number of instructions behind it in the pipeline to be squashed.
When i2 reaches the execution stage and the new PC is being calculated, instructions i3 and i4 have already entered the pipeline and must be squashed. This creates a series of bubbles in the pipeline until the target of the branch instruction, ia, and its successors can be fetched from memory. The diagram shows the cost of branches fairly dramatically - a large number of bubbles are present for quite a few cycles. This diagram also makes an optimistic assumption - that the new instruction stream could be accessed within 3 cycles (not realistic if the new instructions have to come from memory rather than cache!). Branches occur very frequently in typical programs - of the order of one every 10 instructions! This shows the importance of the branch prediction strategies which we will examine later.
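The two figures above - roughly one branch every 10 instructions and an (optimistic) 3-cycle refill - translate directly into a cycles-per-instruction estimate. A back-of-the-envelope calculation in Python (the 60% taken-branch fraction is a made-up illustrative number, not from the text):

```python
# Effective CPI when every taken branch costs `penalty` refill cycles.
def effective_cpi(base_cpi, branch_fraction, taken_fraction, penalty):
    return base_cpi + branch_fraction * taken_fraction * penalty

# Figures from the text: ~1 branch per 10 instructions, 3-cycle refill.
# The 60% taken fraction is a hypothetical value for illustration.
print(round(effective_cpi(1.0, 0.10, 0.6, 3), 2))   # 1.18
# If every branch cost the full refill (no prediction at all):
print(round(effective_cpi(1.0, 0.10, 1.0, 3), 2))   # 1.3
```

Even with these optimistic numbers, branches add 18-30% to the execution time - which is why architects invest so heavily in branch prediction.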
Thus instead of emitting this:

    mult $4, $2, $1
    add  $3, $4, $5
    ret
    sub  $4, $6, $7

the compiler would emit:

    mult $4, $2, $1
    add  $3, $4, $5
    ret
    or   $1, $1, $1    ; no-op in the branch delay slot
    sub  $4, $6, $7
The compiler is asked to move an instruction which precedes the branch, and which must be executed in any case, into the branch delay slot, where it will be executed while the branch target is being fetched. Current RISC machines have a one-instruction branch delay slot - occasionally two.
This is an example of the modern trend in computer architecture - exposing more details of the underlying machine to the compiler and letting it generate the most efficient code. In this case, it is almost trivial for the compiler to find an instruction before the branch which must be executed. Compilers are rapidly becoming more capable, and much more complex instruction-reordering operations are routinely performed by optimising compilers. We will see more examples of this movement of responsibility for execution efficiency from the hardware to the software later.
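The delay-slot fill can be sketched as a simple rewrite over an instruction list. This is a deliberately naive Python sketch: the safety check a real compiler performs (that the moved instruction does not feed the branch) is omitted, and `or $1, $1, $1` - the no-op from the example above - is reused as padding when nothing can be moved.

```python
# Sketch of one-instruction delay-slot filling for unconditional branches.
# NOTE: no dependence analysis is done - a real compiler must verify the
# moved instruction is safe to execute after the branch issues.
NOP = ("or", "$1, $1, $1")   # harmless instruction used as a no-op

def fill_delay_slot(instrs):
    """instrs: list of (opcode, operands) tuples; returns a rewritten list."""
    out = []
    for instr in instrs:
        if instr[0] == "ret" and out:
            mover = out.pop()        # instruction that precedes the branch
            out += [instr, mover]    # it now executes in the delay slot
        elif instr[0] == "ret":
            out += [instr, NOP]      # nothing movable: pad with a no-op
        else:
            out.append(instr)
    return out

code = [("mult", "$4, $2, $1"), ("add", "$3, $4, $5"), ("ret", "")]
print(fill_delay_slot(code))
# [('mult', '$4, $2, $1'), ('ret', ''), ('add', '$3, $4, $5')]
```

The `add` now occupies the cycle that would otherwise be wasted while the return target is fetched.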
    start: addi $3, $3, 4      ; increment address
           lw   $2, 0($3)      ; load next array element
           add  $1, $1, $2     ; add to total
           subu $5, $5, 1      ; decrement count
           bne  $5, $0, start  ; if not zero, branch back

This loop sums the values in an array. The element count is stored in register 5 and is decremented on each iteration of the loop. The loop terminates when the count reaches 0. The bne tests the value produced by the subu instruction: if it's not zero, it branches back to the beginning of the loop. However, this value is not available until the subu instruction has reached the WB stage. The branch instruction must therefore stall in the execution stage, because one of its operands was not available when it entered this stage. This is the first example of a pipeline hazard - we shall see more in the next section.
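What the loop computes (as opposed to how it is timed) can be checked with a direct Python mirror of the assembly - registers become variables and memory a word-addressed dictionary (an illustrative model only):

```python
# Python mirror of the MIPS loop: $1 = total, $3 = address, $5 = count.
def sum_array(mem, base, count):
    total = 0                 # $1
    addr = base               # $3 - pre-incremented, as in the assembly
    while count != 0:         # loop while $5 != 0 (the bne)
        addr += 4             # addi $3, $3, 4
        total += mem[addr]    # lw $2, 0($3) ; add $1, $1, $2
        count -= 1            # subu $5, $5, 1
    return total

mem = {4: 10, 8: 20, 12: 30}  # toy word-addressed memory
print(sum_array(mem, 0, 3))   # 60
```

Note that the stall described above recurs on every iteration: the bne depends on the subu immediately before it, so the hazard is paid once per trip around the loop.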