Contents Computer Evolution and Performance Computer organization and architecture, structure and function, Evolution (a brief history) of computers, Designing for Performance, Evolution of Intel processor architecture-n 4 bit to 64 bit, performance assessment. A top level view of Computer function and interconnection Computer Components, Computer Function, Interconnection structure, bus interconnection
Contents Computer Arithmetic The Arithmetic and Logic Unit, addition and subtraction of signed numbers, Design of adder and fast adder, carry look ahead addition, multiplication of positive numbers, signed operand multiplication, Booths algorithm, fast multiplication, integer division. Floating point representation and operations IEEE standard, arithmetic operations, guard bits and truncation.
Computer Evolution and Performance Computer organization and architecture Structure and function Evolution (a brief history) of computers Designing for Performance Evolution of Intel processor architecture-n 4 bit to 64 bit Performance assessment
Computer organization and architecture
Computer organization and architecture Computer architecture refers to those attributes of a system visible to a programmer or put another way, those attributes that have direct impact on the logical execution of a program. Architectural attributes: Instruction sets, Bits used to represent data, I/O Mechanisms Computer organization refers to the operational units and their interconnections that realize the architectural specification. Organizational Attributes: Control signals, memory technology, Interfaces between computer and peripherals.
Cont.. Computer organization refers to the operational units and their interconnection that realize the architecture specification. It’s also a Structure of von Neumann machine.
Cont.. In computer engineering, computer architecture is a set of rules and methods that describe the functionality, organization, and implementation of computer systems. Some definitions of architecture define it as describing the capabilities and programming model of a computer but not a particular implementation.
Computer Components Hardware Software I/O Modules Main memory
Computer Components Hardwired program: Process of connecting the various components in the desired combination as a form of programming. The resulting program is in the form of hardware. Software: A sequence of code or instructions. I/O Modules: Data and instruction: input to the system Reporting results: output to the system Main memory: Place to store temporarily both instructions and data.
MAR: Memory Address Register Specifies address in memory for the next read or write MBR: Memory Buffer Register Contains data to be written into memory or receives data read from memory. I/O Module: Transfer data from external device to CPU and Memory and vice versa. Contains buffer for temporary holding data. Computer Components: Top level view
I/O AR: Input/output Address Register Specifies a particular I/O device I/O BR: Input/output Buffer Register Used to exchange data between an I/O Module and the CPU. Main Memory: Set of locations-> sequentially numbered address. Each location contains binary number (Interpreted as instruction or data) Computer Components: Top level view
Computer function Instruction Fetch and Execute Interrupts Instruction cycle Multiple interrupts I/O Functions
Computer function Basic computer function: Execution of program which consist of a set of instructions stored in memory. Processor will execute instructions. Two parts of Instruction processing: 1. Instruction fetch 2. Instruction Execution Processing required for single instruction is called an instruction cycle.
Computer function: Instruction Fetch and Execute
Cont.. always increments the PC after each instruction fetch so that it will fetch the next instruction in sequence. The fetched instruction is loaded into a register in the processor known as the instruction register (IR).
Example A single instruction cycle with the following steps occurs: Fetch the ADD instruction. Read the contents of memory location A into the processor. Read the contents of memory location B into the processor. In order that the contents of A are not lost, the processor must have at least two registers for storing memory values, rather than a single accumulator. Add the two values. Write the result from the processor to memory location A.
Instruction Cycle
Instruction cycle Instruction address calculation ( iac ): Determine the address of the next instruction to be executed. Usually, this involves adding a fixed number to the address of the previous instruction. Instruction fetch (if): Read instruction from its memory location into the processor. Instruction operation decoding ( iod ): Analyze instruction to determine type of operation to be performed and operand(s) to be used.
Instruction Cycle Operand address calculation ( oac ): If the operation involves reference to an operand in memory or available via I/O, then determine the address of the operand. Operand fetch (of): Fetch the operand from memory or read it in from I/O. Data operation (do): Perform the operation indicated in the instruction. Operand store ( os ): Write the result into memory or out to I/O.
Instruction Cycle with Interrupt
Cont.. An interrupt is just that: an interruption of the normal sequence of execution.
Cont.. If no interrupts are pending, the processor proceeds to the fetch cycle and fetches the next instruction of the current program. If an interrupt is pending, the processor does the following: It suspends execution of the current program being executed and saves its context. This means saving the address of the next instruction to be executed and any other data relevant to the processor’s current activity. It sets the program counter to the starting address of an interrupt handler routine.
Instruction cycle with Interrupt
Interconnection Structures A computer consists of a set of components (CPU,memory,I/O) that communicate with each other. The collection of paths connecting the various modules is call the interconnection structure . The design of this structure will depend on the exchange that must be made between modules .
Input/Output for each module Memory N Word . . . N-1 Read Write Address Data Data
CPU CPU Interrupt Signal Data Data Instructions Control Signal
I/O module I/O Module M Ports Read Write Address Internal Data External Data Internal Data External Data Interrupt Signal
Type of transfers Memory to CPU CPU to Memory I/O to CPU CPU to I/O I/O to or from Memory (DMA)
Computer Arithmetic Arithmetic and Logical Unit
Cont..
Cont..
ALU The ALU is that part of the computer that actually performs arithmetic and logical operations on data. All of the other elements of the computer system—control unit, registers, memory, I/O—are there mainly to bring data into the ALU for it to process and then to take the results back out.
ALU The ALU is interconnected with the rest of the processor. Data are presented to the ALU in registers, and the results of an operation are stored in registers. These registers are temporary storage locations within the processor that are connected by signal paths to the ALU The ALU may also set flags as the result of an operation. The control unit provides signals that control the operation of the ALU and the movement of the data into and out of the ALU.
Booth’s Algorithm Booth's multiplication algorithm is a multiplication algorithm that multiplies two signed binary numbers in two's complement notation. The algorithm was invented by Andrew Donald Booth in 1950 while doing research on crystallography at Birkbeck College in Bloomsbury, London.
Booth’s Algorithm Decide which operand will be the multiplier and which will be the multiplicand Convert both operands to two's complement representation using X bits X must be at least one more bit than is required for the binary representation of the numerically larger operand Begin with a product that consists of the multiplier with an additional X leading zero bits
Step 1 for each pass Use the LSB (least significant bit) and the previous LSB to determine the arithmetic action . (Q0 and Q-1) Possible arithmetic actions: 00 no arithmetic operation 01 add multiplicand to left half of product 10 subtract multiplicand from left half of product 11 no arithmetic operation
Step 2 for each pass Perform an arithmetic right shift (ASR) on the entire product.
Booth’s Algorithm
Multiplication Process By Booth’s Encoding Algorithm
Multiplication Multiplication of two's-complement numbers more complicated Because performing a straightforward unsigned multiplication of the two's complement representations of the inputs does not give the correct result
Cont.. Multipliers could be designed to convert both of their inputs to positive quantities. use the sign bits of the original inputs to determine the sign of the result Increases the time required to perform a multiplication
Booth’s Algorithm A technique called Booth encoding To quickly convert two's-complement numbers into a format that is easily multiplied
Example Booth’s Encoding Example
Fast Multiplication Fast multiplication by a combination of methods 1. Bit Pair Recording of Multipliers and 2. Carry Save Addition of the Sums
Bit Pair Recording of Multipliers When Booth’s algorithm is applied to the multiplier bits before the bits are used for getting partial products─ Get fast multiplication by pairing. Multiplication will improve the performance.
Fixed Point and Floating Point Arithmetic
Division Algorithm A division algorithm is an algorithm which, given two integers dividend and Divisor, computes their quotient and/or remainder, the result of division. Division algorithms fall into two main categories: slow division fast division. Slow division algorithms produce one digit of the final quotient per iteration. Examples of slow division include restoring. non-performing restoring, non-restoring. Fast division methods start with a close approximation to the final quotient and produce twice as many digits of the final quotient on each iteration.
Restoring Division An n-bit positive divisor is loaded into register M and an n-bit positive dividend is loaded into register Q at the start of the operation. Register A is set to 0. After the division is complete, the n-bit quotient is in register Q and the remainder is in register A. The required subtractions are facilitated by using 2's-complement arithmetic. The extra bit position at the left end of both A and M accommodates the sign bit during subtractions.
Algorithm The following algorithm performs restoring division. Do the following n times: 1. Shift A and Q left one binary position. 2. Subtract M from A, and place the answer back in A. 3. If the sign of A is 1, set q0 to 0 and add M back to A (that is, restore A); otherwise, set q0 to 1.
Restoring Division Example
Flow Chart of Restoring Division
Non-Restoring Division Algorithm The restoring-division algorithm can be improved by avoiding the need for restoring a after an unsuccessful subtraction. Subtraction is said to be unsuccessful if the result is negative. Consider the sequence of operations that takes place after the subtraction operation in the preceding algorithm. If A is positive, we shift left and subtract M, that is, we perform 2A - M. If A is negative, we restore it by performing A + M, and then we shift it left and subtract M. This is equivalent to performing 2A + M. The q0 bit is appropriately set to 0 or 1 after the correct operation has been performed.
Non-Restoring Division Algorithm The following algorithm for non restoring division. Step 1: Do the following n times: 1.If the sign of A is 0, shift A and Q left one bit position and subtract M from A; otherwise, shift A and Q left and add M to A. 2. Now, if the sign of A is 0, set q0 to 1; otherwise, set q0 to 0. Step 2: If the sign of A is 1, add M to A.
Non-Restoring division Example
Non-Restoring Division Algorithm Flowchart
IEEE 754 Standard
Floating Point Number Representation for non-integral numbers Including very small and very large numbers Like scientific notation –2.34 × 1056 +0.002 × 10–4 +987.02 × 109 In binary ±1. xxxxxxx2 × 2yyyy Types float and double in C
FLOATING POINT STANDARD Defined by IEEE Std 754-1985 A way to represent very large or very small numbers precisely using scientific notation in binary form Developed in response to divergence of representations Portability issues for scientific code
Cont.. Two representations Single precision (32-bit) Double precision (64-bit)
Cont..
Arithmetic Operations on Floating-Point Numbers
Add and Subtract Rule Choose the number with smaller exponent and shift its mantissa right a number of steps equal to difference in Exponents. Set the Exponent of the result equal to the larger exponent. Perform addition/ Subtraction on the mantissa and determine the sign of the result. Normalize the resulting value if necessary .
Exceptions Overflow : This exception occurs when the result of an operation is too large to be represented as a float in its format. Underflow : This exception occurs when the result of an operation is too small to be represented as a normalized float in its format. Divide by Zero: This exception occurs when a float is divided by zero Invalid: This exception occurs when the result of an operation is ill-defined, such as (/ 0.0 0.0). Inexact : This exception occurs when the result of a floating point operation is not exact, i.e. the result was rounded.
FP Addition & Subtraction Flowchart
Multiply Rule Add the Exponent and subtract 127 Multiply the mantissa and determine the sign of the result. Normalize the resulting value if necessary.
Floating Point Multiplication
Divide Rule Subtract the exponents and add 127 Divide the mantissa and determine the sign of the result. Normalize the resulting value, if necessary.