
Welcome to Our Blog

Reflect, and Happy Reading

Sunday, October 21, 2012

Processor, by Nurul Izzati Farhanah


THE PROCESSOR: PIPELINING



Pipelining Analogy


  • Pipelined laundry: overlapping execution
      • Parallelism improves performance






  • Four loads:
      • Speedup = 8/3.5 ≈ 2.3
  • Non-stop (n loads, n large):
      • Speedup = 2n/(0.5n + 1.5) ≈ 4 = number of stages
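The two speedup figures above can be checked with a few lines of Python. This is only a sketch: it assumes four laundry stages of half an hour each, which is what the formula 2n/(0.5n + 1.5) implies, and the function name is made up for illustration.

```python
def laundry_speedup(loads, stages=4, stage_hours=0.5):
    """Speedup of pipelined over sequential laundry for a given number of loads."""
    # Sequential: every load passes through all stages before the next starts (2n hours).
    sequential = loads * stages * stage_hours
    # Pipelined: the first load fills the pipeline, then one load finishes
    # every stage time (0.5n + 1.5 hours).
    pipelined = (stages - 1 + loads) * stage_hours
    return sequential / pipelined


for n in (4, 100, 10_000):
    print(n, round(laundry_speedup(n), 2))
```

As the number of loads grows, the result approaches the number of stages.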



MIPS Pipeline
  • Five stages, one step per stage
      1. IF: Instruction fetch from memory
      2. ID: Instruction decode & register read
      3. EX: Execute operation or calculate address
      4. MEM: Access memory operand
      5. WB: Write result back to register
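To see how the five stages overlap, the sketch below prints a cycle-by-cycle chart for a short program. It assumes an ideal pipeline with no hazards or stalls, and the instruction names are placeholders chosen for illustration.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_chart(instructions):
    """Return one row per instruction; each row has one cell per clock cycle."""
    total_cycles = len(instructions) + len(STAGES) - 1
    rows = []
    for i, _name in enumerate(instructions):
        row = [""] * total_cycles
        # Instruction i enters IF in cycle i and advances one stage per cycle.
        for s, stage in enumerate(STAGES):
            row[i + s] = stage
        rows.append(row)
    return rows


program = ["lw", "add", "sw"]  # placeholder instruction names
for name, row in zip(program, pipeline_chart(program)):
    print(f"{name:<6}" + "".join(f"{cell:<5}" for cell in row))
```

Each row is shifted one cycle to the right of the one above it, so in the middle cycles several instructions occupy different stages at the same time.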

The first pipeline register (delay module), IF_ID, shown by the red line, separates the IF stage from the next stage (ID).


We are going to break two signals (wires): the PC+4 signal and the Instruction signal. The register needs an input for each of these, and an output for each that takes on the input value at the instant the clock changes from positive to negative (the falling-edge event). The VHDL code for entity "Ifid" is in the text file "IFID.VHD".
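The behaviour described for the IF_ID register can be modelled in a few lines of Python. This is a behavioural sketch only, not the contents of "IFID.VHD"; the class, method, and attribute names are hypothetical.

```python
class IfIdRegister:
    """Behavioural model of an IF/ID pipeline register (not the original VHDL)."""

    def __init__(self):
        self.prev_clk = 0
        self.pc_plus4_out = 0
        self.instruction_out = 0

    def tick(self, clk, pc_plus4_in, instruction_in):
        # Copy inputs to outputs only on a 1 -> 0 clock transition (falling edge).
        if self.prev_clk == 1 and clk == 0:
            self.pc_plus4_out = pc_plus4_in
            self.instruction_out = instruction_in
        self.prev_clk = clk
        return self.pc_plus4_out, self.instruction_out


reg = IfIdRegister()
print(reg.tick(1, 4, 0x8C010000))  # clock high: outputs keep their old values
print(reg.tick(0, 4, 0x8C010000))  # falling edge: outputs take the input values
```

Between falling edges the outputs hold their values, which is what lets the ID stage work on one instruction while IF fetches the next.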




Pipeline Performance

  • Assume time for stages is
      • 100ps for register read or write
      • 200ps for other stages
  • Compare pipelined datapath with single-cycle datapath
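The comparison can be worked through from the stage times above. In this sketch, the mapping from each instruction class to the stages it uses is an assumption that follows the usual textbook layout; it is not given in the text above.

```python
# Stage latencies from the assumptions above, in picoseconds.
STAGE_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

# Assumed stage usage per instruction class (usual textbook layout).
USES = {
    "lw": ["IF", "ID", "EX", "MEM", "WB"],
    "sw": ["IF", "ID", "EX", "MEM"],
    "R-format": ["IF", "ID", "EX", "WB"],
    "beq": ["IF", "ID", "EX"],
}


def instruction_time(kind):
    """Total latency of one instruction class in a single-cycle datapath."""
    return sum(STAGE_PS[stage] for stage in USES[kind])


# Single-cycle: the clock must be long enough for the slowest instruction.
single_cycle_clock = max(instruction_time(kind) for kind in USES)
# Pipelined: the clock must be long enough for the slowest stage.
pipelined_clock = max(STAGE_PS.values())

for kind in USES:
    print(kind, instruction_time(kind), "ps")
print("single-cycle clock:", single_cycle_clock, "ps")
print("pipelined clock:", pipelined_clock, "ps")
```

Under these assumptions the single-cycle clock is set by lw and the pipelined clock by the 200ps stages.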






  • Refer to slide 5: for three instructions, the example does not show a fourfold improvement
      • 2400/1400 ≈ 1.7
  • Add 1,000,000 instructions, each adding 200 ps to the pipelined execution time:
      • Pipelined total execution time = 1,000,000 × 200 ps + 1400 ps = 200,001,400 ps
      • Nonpipelined total execution time = 1,000,000 × 800 ps + 2400 ps = 800,002,400 ps
      • Speedup = 800,002,400/200,001,400 ≈ 4.00 ≈ 800 ps/200 ps
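The same arithmetic can be written as a small function. This is a sketch assuming a five-stage pipeline with a 200 ps clock and an 800 ps single-cycle clock; the function name and parameters are made up for illustration.

```python
def total_time_ps(n, pipelined, stages=5, stage_ps=200, single_cycle_ps=800):
    """Total execution time in picoseconds for n instructions."""
    if pipelined:
        # The first instruction needs all stages; each later one adds one clock.
        return (n + stages - 1) * stage_ps
    # Single-cycle: every instruction takes one long clock cycle.
    return n * single_cycle_ps


for n in (3, 1_000_003):
    slow = total_time_ps(n, pipelined=False)
    fast = total_time_ps(n, pipelined=True)
    print(n, slow, fast, round(slow / fast, 2))
```

With only three instructions the pipeline fill time dominates; with 1,000,003 instructions it is negligible and the speedup approaches the ratio of the two clock periods.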


  • The Cortex-M3 processor has a three-stage pipeline. The pipeline stages are instruction fetch, instruction decode, and instruction execution (see Figure 6.1).



Figure 6.1: The Three-Stage Pipeline in the Cortex-M3

  • Some people might argue that there are four stages because of the pipeline behavior in the bus interface when it accesses memory, but this stage is outside the processor, so the processor itself still has only three stages.
  • When running programs with mostly 16-bit instructions, you will find that the processor might not fetch instructions in every cycle.
  • This is because the processor fetches 32 bits (up to two 16-bit instructions) in one go, so after one instruction is fetched, the next one is already inside the processor. In this case, the processor bus interface may try to fetch the instruction after the next or, if the buffer is full, the bus interface could be idle.
  • Some instructions take multiple cycles to execute; in this case, the pipeline is stalled.
  • When a branch instruction is executed, the pipeline is flushed.
  • The processor then has to fetch instructions from the branch destination to fill up the pipeline again.
  • However, the Cortex-M3 processor supports a number of instructions in the v7-M architecture that allow some short-distance branches to be avoided by replacing them with conditionally executed code.[1]
  • Due to the pipelined nature of the processor and to ensure that the program is compatible with Thumb code, when the program counter is read during instruction execution, the read value will...
  • http://www.youtube.com/watch?v=DxOGkwFQ8EU
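The effect of stalls and flushes in a three-stage pipeline can be illustrated with a toy cycle counter. This is a simplified model, not real Cortex-M3 timing: it assumes that a taken branch costs a full refill of the two earlier stages, and the function name and program encoding are hypothetical.

```python
def toy_cycle_count(program, stages=3):
    """Count cycles for a list of (execute_cycles, is_taken_branch) tuples.

    Toy model only: the penalties are assumptions, not Cortex-M3 figures.
    """
    cycles = stages - 1  # cycles needed to fill the pipeline at the start
    for execute_cycles, is_taken_branch in program:
        # A multi-cycle instruction holds the execute stage, stalling the pipeline.
        cycles += execute_cycles
        if is_taken_branch:
            # Flush: the fetch and decode stages must be refilled from the target.
            cycles += stages - 1
    return cycles


straight_line = [(1, False)] * 4
with_branch = [(1, False), (1, True), (1, False)]
with_stall = [(1, False), (3, False), (1, False)]
print(toy_cycle_count(straight_line))
print(toy_cycle_count(with_branch))
print(toy_cycle_count(with_stall))
```

In this model, avoiding a short-distance branch removes the refill cost entirely, which is the motivation for conditional execution.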


By Nurul Izzati Farhanah






