Computer Organization And Architecture: processor by nurul izzati farnanah

THE PROCESSOR: PIPELINING

Pipelining Analogy

• Pipelined laundry: overlapping execution

◦ Parallelism improves performance

▪ Four loads: Speedup = 8/3.5 ≈ 2.3

▪ Non-stop (n loads): Speedup = 2n/(0.5n + 1.5) ≈ 4 for large n, which equals the number of stages
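
The laundry speedup figures can be checked with a short calculation. This is a minimal sketch, assuming four stages of 0.5 hours each (inferred from the 2n and 0.5n + 1.5 terms in the formula); the function name `laundry_speedup` is hypothetical.

```python
def laundry_speedup(loads, stages=4, stage_hours=0.5):
    """Return (sequential_hours, pipelined_hours, speedup) for `loads` loads.

    Assumption: every stage takes the same time (0.5 h) and there are 4 stages.
    """
    # Without overlap, every load runs through all stages before the next starts.
    sequential = loads * stages * stage_hours
    # With overlap, the first load takes all stages; each later load
    # finishes one stage-time after the previous one.
    pipelined = stages * stage_hours + (loads - 1) * stage_hours
    return sequential, pipelined, sequential / pipelined


for n in (4, 100, 1_000_000):
    seq, pipe, speedup = laundry_speedup(n)
    print(f"{n} loads: {seq} h sequential, {pipe} h pipelined, speedup {speedup:.2f}")
```

As the number of loads grows, the 1.5 hours spent filling the pipeline matter less and less, so the speedup approaches the number of stages.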

MIPS Pipeline

• Five stages, one step per stage

1. IF: Instruction fetch from memory

2. ID: Instruction decode & register read

3. EX: Execute operation or calculate address

4. MEM: Access memory operand

5. WB: Write result back to register
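
To see how the five stages overlap, the sketch below prints a simple cycle-by-cycle diagram. It assumes an ideal pipeline with no hazards or stalls, and the instruction labels are hypothetical placeholders.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_diagram(instructions):
    """Return (name, row) pairs; row[c] is the stage occupied in clock cycle c."""
    total_cycles = len(instructions) + len(STAGES) - 1
    rows = []
    for i, name in enumerate(instructions):
        row = ["--"] * total_cycles
        for s, stage in enumerate(STAGES):
            # Instruction i starts one cycle after instruction i - 1,
            # so it reaches stage s in cycle i + s.
            row[i + s] = stage
        rows.append((name, row))
    return rows


# Hypothetical instruction labels, used only for illustration.
for name, row in pipeline_diagram(["lw $1", "lw $2", "lw $3"]):
    print(f"{name:<6}", " ".join(f"{cell:<3}" for cell in row))
```

Each row is shifted one cycle to the right of the row above it, so three instructions finish in 7 cycles instead of 15.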

The first pipeline register (delay module), IF_ID, shown by the red line, separates the IF stage from the next stage (ID).

We are going to break two signals (wires): the PC+4 signal and the Instruction signal. Each needs an input, plus an output that takes on the input value at the instant the clock changes from positive to negative (the falling-edge event). The VHDL code for entity "Ifid" is kept in the text file "IFID.VHD".
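
The latching behaviour described above can be modelled in a few lines. This is a minimal behavioural sketch in Python, not the VHDL from IFID.VHD; the class name, method name, and attribute names are hypothetical, and the instruction word in the usage lines is an arbitrary example value.

```python
class IfIdRegister:
    """Behavioural model of an IF/ID pipeline register (falling-edge triggered)."""

    def __init__(self):
        self.prev_clk = 0
        self.pc_plus4_out = 0
        self.instruction_out = 0

    def tick(self, clk, pc_plus4_in, instruction_in):
        """Sample the clock; latch both inputs only on a 1 -> 0 transition."""
        if self.prev_clk == 1 and clk == 0:
            self.pc_plus4_out = pc_plus4_in
            self.instruction_out = instruction_in
        self.prev_clk = clk
        return self.pc_plus4_out, self.instruction_out


reg = IfIdRegister()
reg.tick(1, 4, 0x8C010000)         # clock high: outputs keep their old values
print(reg.tick(0, 4, 0x8C010000))  # falling edge: outputs now follow the inputs
```

Between falling edges the outputs hold their last latched values, which is what lets the ID stage work on one instruction while IF fetches the next.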

http://www.youtube.com/watch?v=Sk4puph6GCI

Pipeline Performance

• Assume time for stages is

◦ 100ps for register read or write

◦ 200ps for other stages

• Compare pipelined datapath with single-cycle datapath

• As the example on slide 5 shows, the total time for three instructions does not reflect the fourfold improvement:

◦ 2400/1400 ≈ 1.7

• Now add 1,000,000 instructions; in the pipelined datapath each one adds 200 ps to the total execution time:

◦ Pipelined total execution time = 1,000,000 x 200 ps + 1400 ps = 200,001,400 ps

◦ Nonpipelined total execution time = 1,000,000 x 800 ps + 2400 ps = 800,002,400 ps

◦ Speedup = 800,002,400/200,001,400 ≈ 4.00 ≈ 800 ps/200 ps
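
The arithmetic above can be reproduced directly. This is a small sketch using only the figures given in the slides; the helper name `total_time_ps` is hypothetical.

```python
def total_time_ps(extra_instructions, per_instruction_ps, base_ps):
    """Total time: the three-instruction base case plus the added instructions."""
    return extra_instructions * per_instruction_ps + base_ps


# Three instructions only: 1400 ps pipelined versus 2400 ps nonpipelined.
print(f"3 instructions: speedup = {2400 / 1400:.2f}")

# Add 1,000,000 instructions to each case.
pipelined = total_time_ps(1_000_000, 200, 1400)
nonpipelined = total_time_ps(1_000_000, 800, 2400)
speedup = nonpipelined / pipelined
print(f"pipelined    = {pipelined:,} ps")
print(f"nonpipelined = {nonpipelined:,} ps")
print(f"speedup      = {speedup:.5f}")
```

With many instructions the fixed start-up cost becomes negligible, so the speedup approaches the ratio of the per-instruction times, 800 ps/200 ps.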

The Cortex-M3 processor has a three-stage pipeline. The pipeline stages are instruction fetch, instruction decode, and instruction execution (see Figure 6.1).

Figure 6.1: The Three-Stage Pipeline in the Cortex-M3

Some people might argue that there are four stages because of the pipeline behavior in the bus interface when it accesses memory, but this stage is outside the processor, so the processor itself still has only three stages.
When running programs with mostly 16-bit instructions, you will find that the processor might not fetch instructions in every cycle. This is because the processor fetches 32 bits, that is, up to two 16-bit instructions, in one go, so after one instruction is fetched, the next one is already inside the processor. In this case, the processor bus interface may try to fetch the instruction after the next or, if the buffer is full, the bus interface could be idle.
Some instructions take multiple cycles to execute; in this case, the pipeline is stalled. When a branch instruction is executed, the pipeline is flushed, and the processor has to fetch instructions from the branch destination to fill up the pipeline again. However, the Cortex-M3 processor supports a number of instructions in the v7-M architecture that allow some short-distance branches to be avoided by replacing them with conditionally executed code.^[1]
Due to the pipeline nature of the processor and to ensure that the program is compatible with Thumb codes, when the program counter is read during instruction execution, the read value will...
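
The effect of stalls and flushes on a three-stage pipeline can be illustrated with a rough cycle count. This is a simplified sketch, not the actual Cortex-M3 timing: it assumes that a multi-cycle instruction holds the execute stage for its full duration and that a taken branch costs `stages - 1` extra cycles to refill the pipeline. The function name and the example program are hypothetical.

```python
def cycles_three_stage(program, stages=3):
    """Estimate total cycles for a list of (execute_cycles, is_taken_branch) tuples.

    Assumptions (not real Cortex-M3 timings): a taken branch flushes the
    fetch and decode stages, so refilling costs stages - 1 extra cycles.
    """
    cycles = stages - 1  # cycles needed before the first instruction reaches execute
    for execute_cycles, taken_branch in program:
        # A multi-cycle instruction stalls the stages behind it.
        cycles += execute_cycles
        if taken_branch:
            # Flush: instructions at the branch target must be fetched and decoded.
            cycles += stages - 1
    return cycles


straight_line = [(1, False)] * 4
with_stall_and_branch = [(1, False), (2, False), (1, True), (1, False)]
print(cycles_three_stage(straight_line))
print(cycles_three_stage(with_stall_and_branch))
```

Comparing the two counts shows why avoiding short branches with conditional execution can pay off: every taken branch adds refill cycles in which no instruction completes.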
http://www.youtube.com/watch?v=DxOGkwFQ8EU

By Nurul Izzati Farhanah

Sunday, October 21, 2012
