Uni Processor Architecture

1,721 views 24 slides Aug 16, 2022

About This Presentation

Distributed Computing: Processor Architecture, Part 1. This is useful for CTEVT Diploma in Computer Engineering students.


Slide Content

Distributed Computing, EG 3113 CT
Diploma in Computer Engineering, 5th Semester
Unit 2.1: Uni-Processor Architecture
Lecture by: Er. Ashish K.C (Khatri)

Uniprocessor Architecture:
A uniprocessor is a system with a single processor. It has three major components:
- CPU:
  - a set of general-purpose registers along with the program counter
  - special-purpose CPU status registers that store the current state of the CPU and of the program under execution
  - one ALU and one local cache memory
- Main memory
- Input/output system
In addition, a common synchronous bus carries communication between the CPU, main memory, and the I/O system.
6/1/2021 Distributed Computing Notes © Er. Ashish K.C(Khatri) 2


Example: (figure not reproduced)

Parallel processing mechanism:
Parallelism in a uniprocessor means that a system with a single processor performs two or more tasks simultaneously. It can be achieved by two means: hardware and software. Parallelism increases efficiency and reduces processing time.

Hardware approach:
- Multiplicity of functional units
- Parallelism and pipelining within the CPU
- Overlapped CPU and I/O operations
- Use of a hierarchical memory system
- Balancing of subsystem bandwidths

Multiplicity of functional units:
In earlier computers, the CPU had only one arithmetic logic unit, which could perform only one function at a time. This slowed the execution of long sequences of arithmetic instructions. To overcome this, the number of functional units in the CPU can be increased so that arithmetic operations are performed in parallel.

Parallelism and Pipelining within the CPU
- Parallel adders can be implemented using techniques such as carry-lookahead and carry-save. A parallel adder is a digital circuit that adds two multi-bit binary numbers by operating on corresponding pairs of bits in parallel.
- The multiplier can be recoded to eliminate more complex calculations.
- The phases of instruction execution are pipelined. To support overlapped instruction execution, techniques such as instruction prefetch and data buffers are used.
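The carry-lookahead idea mentioned above can be sketched in software. This is a minimal illustration, not a hardware description: the function names `carry_into` and `carry_lookahead_add` and the 4-bit default width are assumptions made for the example. Each carry is derived directly from the per-bit generate and propagate signals, without waiting for the carry of the previous bit.

```python
def carry_into(i, g, p, c0):
    """Carry entering bit position i, derived only from g, p and c0."""
    # The incoming carry survives if every lower bit propagates it.
    terms = [bool(c0) and all(p[:i])]
    # A carry generated at bit j survives if bits j+1 .. i-1 all propagate it.
    for j in range(i):
        terms.append(bool(g[j]) and all(p[j + 1:i]))
    return int(any(terms))


def carry_lookahead_add(a, b, width=4, c0=0):
    """Add two width-bit integers; returns (sum_bits, carry_out)."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    g = [x & y for x, y in zip(a_bits, b_bits)]   # generate signals
    p = [x ^ y for x, y in zip(a_bits, b_bits)]   # propagate signals
    # Every carry depends only on g, p and c0, so all could be formed at once.
    carries = [carry_into(i, g, p, c0) for i in range(width + 1)]
    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i          # sum bit = p XOR carry-in
    return total, carries[width]
```

For example, `carry_lookahead_add(0b1011, 0b0110)` returns `(1, 1)`: the 4-bit sum is 0001 with a carry out of 1, which is 17 = 11 + 6.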

Overlapped CPU and I/O Operation
I/O operations can be executed in parallel with CPU operations by using I/O controllers or I/O processors. Direct memory access (DMA) can be used to transfer data directly between an I/O device and main memory.

Use of a Hierarchical Memory System
The CPU is much faster than memory access, by a factor often quoted as about 1000, and this gap slows overall processing. A hierarchical memory system can be used to bridge it. The fastest memory is the set of registers in the CPU, followed by cache memory, which buffers data between the CPU and main memory.
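The benefit of placing a cache between the CPU and main memory can be estimated with the standard average memory access time formula: hit time plus miss rate times miss penalty. The sketch below is illustrative only; the function name, parameter names, and the timing figures are assumptions chosen for the example and do not come from the slides.

```python
def average_access_time(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average memory access time for one cache level in front of main memory."""
    # Every access pays the cache hit time; misses also pay the main-memory penalty.
    return hit_time_ns + miss_rate * miss_penalty_ns


# Hypothetical figures: 1 ns cache hit, 5% miss rate, 100 ns main-memory penalty.
with_cache = average_access_time(1.0, 0.05, 100.0)
# Without a cache, every access goes to main memory.
without_cache = 100.0
print(f"with cache: {with_cache:.1f} ns, without cache: {without_cache:.1f} ns")
```

With these assumed numbers the average access time drops from 100 ns to about 6 ns, which is the sense in which the cache narrows the speed gap.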

Balancing of Subsystem Bandwidth
The processing and access times of the CPU, main memory, and I/O devices differ. Arranged in descending order, they are:
t_d > t_m > t_p
where t_d is the processing time of the I/O device, t_m is the processing time of main memory, and t_p is the processing time of the central processing unit. I/O devices are the slowest units and the CPU is the fastest.
- To balance the speed of the CPU and memory, a fast cache memory can be used to buffer information between them.
- To balance the bandwidth between memory and I/O devices, input-output channels with different speeds can be used between main memory and the I/O devices.

Software approach for parallelism:
- Multiprogramming
- Time-sharing

Multiprogramming:
Multiple processes may be active in a computer at once, some competing for memory, some for I/O devices, and some for the CPU. To balance these demands, program interleaving is practiced. It boosts resource utilization by overlapping I/O and CPU operations. For example, while process P1 is engaged in an I/O operation, the process scheduler can switch the CPU to process P2. P1 and P2 then make progress concurrently. This interleaving of CPU and I/O operations among programs is called multiprogramming.
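The gain from overlapping CPU and I/O work can be illustrated with a small simulation. This is a simplified sketch under stated assumptions: each process is a list of hypothetical (CPU burst, I/O burst) pairs, there is one CPU, I/O devices are never contended, and the function names are invented for the example.

```python
def run_serial(processes):
    """Total time when each process runs to completion before the next starts."""
    return sum(cpu + io for bursts in processes for cpu, io in bursts)


def run_multiprogrammed(processes):
    """Total time when the CPU switches to another process during I/O waits."""
    clock = 0                                  # current time on the single CPU
    ready_at = [0] * len(processes)            # when each process next needs the CPU
    next_burst = [0] * len(processes)          # index of each process's next burst
    finish = 0
    while True:
        pending = [i for i, p in enumerate(processes) if next_burst[i] < len(p)]
        if not pending:
            return finish
        # Run the process that becomes ready earliest (non-preemptive).
        i = min(pending, key=lambda k: ready_at[k])
        clock = max(clock, ready_at[i])        # the CPU idles if nothing is ready
        cpu, io = processes[i][next_burst[i]]
        clock += cpu                           # CPU burst occupies the CPU
        ready_at[i] = clock + io               # I/O burst proceeds without the CPU
        finish = max(finish, ready_at[i])
        next_burst[i] += 1


p1 = [(2, 4), (2, 4)]   # hypothetical bursts for P1
p2 = [(3, 3), (3, 3)]   # hypothetical bursts for P2
print(run_serial([p1, p2]), run_multiprogrammed([p1, p2]))
```

With these assumed bursts the serial schedule takes 24 time units and the interleaved schedule takes 14, because P2 uses the CPU while P1 waits on I/O and vice versa.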

Time-sharing:
Multiprogramming on a uniprocessor centers on sharing CPU time among multiple programs. Sometimes a high-priority program can occupy the CPU for a long period, starving the other processes in the computer. Time-sharing addresses this by assigning fixed or variable time slices of the CPU to the processes. This gives every process in the computer an equal opportunity to run.
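Fixed time slices are commonly implemented with round-robin scheduling. The following is a minimal sketch, assuming all processes arrive at time zero and ignoring context-switch cost; the function name `round_robin` and the burst times are hypothetical.

```python
from collections import deque


def round_robin(burst_times, quantum):
    """Return the completion time of each process under round-robin scheduling."""
    queue = deque(enumerate(burst_times))      # (process id, remaining CPU time)
    clock = 0
    completion = [0] * len(burst_times)
    while queue:
        pid, remaining = queue.popleft()
        run = min(quantum, remaining)          # run for one slice at most
        clock += run
        if remaining > run:
            queue.append((pid, remaining - run))   # unfinished: back of the queue
        else:
            completion[pid] = clock
    return completion


print(round_robin([10, 2, 3], quantum=2))
```

With bursts of 10, 2, and 3 time units and a slice of 2, the completion times are 15, 4, and 9. Run in arrival order without slicing, the two short processes would instead finish at 12 and 15, waiting behind the long one.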

CISC architecture:
CISC stands for Complex Instruction Set Computer. This kind of processor includes a large collection of instructions, from simple to complex. The instructions are specified at the assembly language level, and they take more time to execute. In a CISC machine, a single instruction can perform several low-level operations, such as a load from memory, an arithmetic operation, and a memory store, or can carry out multi-step processes or addressing modes, as the name "Complex Instruction Set" suggests. The design therefore aims to reduce the number of instructions per program, at the cost of more cycles per instruction.
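The trade-off between instruction count and cycles per instruction can be made concrete with the standard CPU time relation: execution time = instruction count × cycles per instruction × clock cycle time. The sketch below uses invented figures purely to show how the terms interact; it is not a measurement of any real processor, and the function name is an assumption.

```python
def cpu_time_ns(instruction_count, cycles_per_instruction, cycle_time_ns):
    """Execution time from the three factors of the CPU time relation."""
    return instruction_count * cycles_per_instruction * cycle_time_ns


# Hypothetical program on a design with few, multi-cycle instructions.
few_complex = cpu_time_ns(100, 4.0, 1.0)
# The same hypothetical program on a design with more, simpler instructions.
many_simple = cpu_time_ns(200, 1.5, 1.0)
print(few_complex, many_simple)
```

Under these assumed numbers the program with fewer instructions takes 400 ns and the one with more instructions takes 300 ns. Different assumptions would reverse the result, which is why instruction count alone does not determine performance.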

CISC computers have small programs. They have a large number of compound instructions, which take a long time to execute. A single instruction is carried out in several steps, and an instruction set can contain more than 300 separate instructions. Most instructions complete in two to ten machine cycles. Instruction pipelining is not easily implemented in CISC.

Characteristics of CISC:
- Complex instructions, hence complex instruction decoding.
- Instructions are larger than one word.
- An instruction may take more than one clock cycle to execute.
- Fewer general-purpose registers, since operations are performed in memory itself.
- Complex addressing modes.
- More data types.

Disadvantages of CISC:
- Only about 20% of the available instructions are typically used in a program.
- Compared with RISC processors, CISC processors are generally slower at executing each instruction.
- The processor uses a larger number of transistors than a RISC processor.
- Pipelined execution is difficult to implement in CISC.
- Machine performance suffers because of the lower clock speed.

RISC architecture:
RISC stands for "Reduced Instruction Set Computer". It is a CPU design strategy based on simple instructions that execute quickly. The instruction set is small, or reduced, and every instruction is expected to perform a very small job. Because the instructions are modest and simple, they can be combined to compose more complex operations. Each instruction has the same length, and instructions are strung together to accomplish compound tasks. Most instructions complete in one machine cycle. Pipelining is a crucial technique used to speed up RISC machines.
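Why pipelining helps can be shown with the usual idealized cycle count. The sketch assumes an ideal pipeline in which every stage takes one cycle and there are no stalls or hazards; the function names and the 5-stage, 100-instruction figures are assumptions for the example.

```python
def unpipelined_cycles(n_instructions, n_stages):
    """Each instruction passes through all stages before the next one starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages):
    """The first instruction fills the pipeline; each later one adds one cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def ideal_speedup(n_instructions, n_stages):
    """Ratio of unpipelined to pipelined cycles under the ideal assumptions."""
    return unpipelined_cycles(n_instructions, n_stages) / pipelined_cycles(
        n_instructions, n_stages
    )


print(pipelined_cycles(100, 5), unpipelined_cycles(100, 5))
print(round(ideal_speedup(100, 5), 2))
```

For 100 instructions on a 5-stage pipeline this gives 104 cycles instead of 500, a speedup of about 4.81. The speedup approaches the number of stages as the instruction count grows, and real pipelines fall short of it when instructions depend on one another.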

Characteristics of RISC:
- Simpler instructions, hence simple instruction decoding.
- Instructions fit within one word.
- An instruction takes a single clock cycle to execute.
- More general-purpose registers.
- Simple addressing modes.
- Fewer data types.
- Pipelining can be achieved.

Disadvantages of RISC:
- Performance may vary with the executed code, because subsequent instructions may depend on an earlier instruction for their execution within a cycle.
- Complex operations that compilers and programmers frequently use must be built from several simple instructions.
- These processors need very fast memory to hold instructions, which calls for a large cache so that instructions can be served quickly.


Differences between RISC and CISC: