Computer Organization Unit-I Osmania University

Computer Organization Subject Code- PC 402 CS UNIT I Prepared by M.Sowmya Assistant professor Department of CSE

Course Objectives and Outcomes
Course Objectives:
• To provide in-depth knowledge of the design and organization of a digital computer, the operation of its functional units, instruction set design, and the factors that influence the performance of a computer.
• To give students an understanding of basic computer architecture, with the instruction set and programming of the 8085 in particular.
• To learn the functionality and interfacing of various peripheral devices.
Course Outcomes: After completing the course, the student will be able to:
• Understand the architecture of a modern computer and its bus structures.
• Analyze the different memories and evaluate the mapping techniques.
• Discuss the architecture, the instruction set, and the addressing modes of the 8085 processor.
• Analyze stacks, subroutines, and interrupts of the 8085, different PPI techniques, and the uses of the 8259, RS 232C, USART (8251), and DMA controller interfaces.
• Design applications of interfacing circuits: 8254/8253 timer, A/D and D/A converters, and keyboard/display controller.

Syllabus
Unit I: Basic Structure of Computers: Computer Types, Functional Units, Basic Operational Concepts, Bus Structures, Performance, Multiprocessors and Multicomputers, Historical perspective. Input/output Organization: Accessing I/O devices, Interrupts, Processor examples, Direct memory access, parallel interface and serial interface.
Unit II: The Memory System: Basic concepts, Semiconductor RAM memories, Read-Only memories, Speed, Size and Cost, Cache memories, Performance considerations, Virtual Memories, Memory management requirements, Secondary Storage.
Unit III: 8085 Architecture: Introduction to microprocessors and microcontrollers, 8085 Processor Architecture, Internal operations, Instructions and timings. Programming the 8085: Introduction to 8085 instructions, Addressing modes and Programming techniques with Additional instructions.

Unit IV: Stacks and subroutines, interfacing peripherals: Basic interfacing concepts, interfacing output displays, Interfacing input keyboards. Interrupts: 8085 Interrupts, Programmable Interrupt Controller (8259A). Direct Memory Access (DMA): DMA Controller (Intel 8257), Interfacing 8085 with Digital to Analog and Analog to Digital converters.
Unit V: Programmable peripheral interface (Intel 8255A), Programmable communication interface (Intel 8251), Programmable Interval timer (Intel 8253 and 8254), Programmable Keyboard/Display controller (Intel 8279). Serial and parallel bus standards RS 232 C, IEEE 488.
Suggested Readings:
1. Carl Hamacher, Zvonko Vranesic, Safwat Zaky, Computer Organization, 5th Edition, McGraw Hill, 2002.
2. Ramesh S Gaonkar, Microprocessor Architecture, Programming, and Applications with the 8085, 5/E, Prentice Hall, 2002.
3. Pal Chouduri, Computer Organization and Design, Prentice Hall of India, 1994.
4. M. M. Mano, Computer System Architecture, 3rd Edition, Prentice Hall.


Early Computing Devices
People used sticks and stones as counting tools before computers were invented. More capable computing devices were produced as technology advanced over time. Some of them are:
• Abacus
• Napier's Bones
• Pascaline
• Stepped Reckoner or Leibniz wheel
• Difference Engine
• Analytical Engine
• Tabulating machine
• Differential Analyzer
• Mark I

History Of Computer
In 1822, Charles Babbage, often called the father of computers, began developing what would become the first mechanical computer. In the 1830s he designed the Analytical Engine, a general-purpose computer whose design contained an ALU, basic flow control, and the concept of integrated memory. More than a century later came the first general-purpose electronic computer, the ENIAC, which stands for Electronic Numerical Integrator and Computer. Its inventors were John W. Mauchly and J. Presper Eckert. Over time the technology developed, computers got smaller, and processing got faster. The first portable and laptop computers were introduced in 1981 by Adam Osborne (the Osborne 1) and Epson (the HX-20).

Revolution Of Computer
• Discrete components: transistors, resistors, and capacitors.
• Integrated circuits (ICs): these components are manufactured on silicon wafers (1960).
• By the early 1970s, it was possible to implement a complete microprocessor on a single chip (1st evolution).
• Moore's law: the number of transistors on a chip doubles every 1.5 to 2 years.
• By 1980, the personal computer came into the picture (2nd evolution).
• By 1990, the Internet could provide connectivity between computers (3rd evolution, the digital evolution).
• By 2000, smartphones and tablets were invented (4th evolution).

Classification Of Computer

Types Of Computer
• Supercomputers
• Mainframe computers
• Minicomputers
• Workstations
• Microcomputers / personal computers

Super Computer
Supercomputers are designed to process huge amounts of data, executing trillions of instructions per second. This is possible because a supercomputer contains thousands of interconnected processors. They are mainly used in scientific and engineering applications such as weather forecasting, scientific simulations, and nuclear energy research. An early landmark was the Cray-1, designed by Seymour Cray and introduced in 1976.
Characteristics of Super Computer:
• They are the fastest and most expensive computers.
• They can perform ten trillion or more individual calculations per second.
• They are used in the stock market and by big organizations for managing online currencies such as bitcoin.
• They are used in scientific research for analyzing data obtained from exploring the solar system, satellites, etc.

Mainframe Computer
Mainframe computers are designed to support hundreds or thousands of users at the same time. They also support multiple programs simultaneously, so they can execute different processes concurrently. They are used by big organizations, such as those in the banking and telecom sectors, that process a high volume of data.
Characteristics of Mainframe Computer:
• It is also an expensive computer.
• It has high storage capacity and great performance.
• It can process a huge amount of data (like the data involved in the banking sector) very quickly.
• It runs smoothly for a long time and has a long life.

Mini Computer
A minicomputer, as the name suggests, is a smaller computer that offers most of the features and capabilities of a large computer. It generally supports multiple users at a time, so it is a multi-user system. It is designed for business applications and services, and can also do time-sharing, batch processing, online processing, etc.

Workstation
A workstation is designed for technical or scientific applications. It consists of a fast microprocessor with a large amount of RAM and a high-speed graphics adapter. It is a single-user computer and is generally used to perform a specific task with great accuracy.
Characteristics:
• It is expensive.
• It is made specifically for complex work.
• It provides large storage capacity, better graphics, and a more powerful CPU than a PC.
• It is also used for animation, data analysis, CAD, and audio and video creation and editing.

Micro Computer
A microcomputer is a general-purpose computer designed for individual use. It consists of a microprocessor as the central processing unit (CPU), memory, an input unit, and an output unit. This kind of computer is suitable for personal work such as preparing an assignment or watching a movie, and for office work. Examples are laptops and desktop computers.
Characteristics of PC:
• Only a limited range of software can be used on it.
• It is the smallest in size.
• It is designed for personal use.
• It is easy to use.

Differences between Micro and Mini Computer

| Microcomputer | Minicomputer |
| --- | --- |
| It is a personal computer introduced in the 1970s and used for general purposes. | It is a small computer introduced in the 1960s and used for business and scientific applications. |
| These computers are used by people for education and entertainment. | These computers are used by companies for manufacturing process control. |
| It is optimized for single processing. | It is optimized for dual processing. |
| It uses a single microprocessor as the CPU, which performs all logic and arithmetic operations. | It uses multiple processors. |
| Storage capacity is in terms of gigabytes (GB). | Storage capacity is in terms of terabytes (TB). |
| They are primarily used for word processing, managing databases or spreadsheets, graphics, and general office applications. | They are primarily used for process control and for financial and administrative tasks, such as word processing and accounting. |
| It is more cost-effective and easier to use than a minicomputer. | It is more costly and more difficult to use than a microcomputer. |

Analog Computer
An analog computer is designed to process analog data, that is, data that changes continuously and does not take discrete values. It is used where approximate rather than exact values are sufficient, such as for speed, temperature, and pressure. It can accept data directly from the measuring device without first converting it into numbers and codes. It measures continuous changes in a physical quantity and gives its output as a reading on a dial or scale. Examples are speedometers and mercury thermometers.

Digital Computer
Digital computers are designed to perform calculations and logical operations at high speed. A digital computer takes raw data as input and processes it with programs stored in its memory to produce the final output. It understands only binary input, 0 and 1, so the raw input data is first converted to 0s and 1s and then processed to produce the result. All modern computers, including laptops, desktops, and smartphones, are digital computers.

Hybrid Computer
As the name suggests, a hybrid is made by combining two different things: the hybrid computer is a combination of analog and digital computers. Hybrid computers are fast like analog computers and have memory and accuracy like digital computers, so they can process both continuous and discrete data. When a hybrid computer accepts analog signals as input, it converts them into digital form before processing. It is widely used in specialized applications where both analog and digital data must be processed. A processor used in petrol pumps that converts measurements of fuel flow into quantity and price is an example of a hybrid computer.

Images of Various Computers

Functional Units of Computer

Functional Units of Computer: Description
A computer consists of five functionally independent main parts:
• Input: The input unit accepts coded information from human operators using devices such as keyboards, or from other computers over digital communication lines. Many other kinds of input devices for human-computer interaction are available, including the touchpad, mouse, joystick, and trackball. These are often used as graphic input devices in conjunction with displays. Microphones can be used to capture audio input, which is then sampled and converted into digital codes for storage and processing. Similarly, cameras can be used to capture video input.
• Memory: The information received is stored in the computer's memory, either for later use or to be processed immediately by the arithmetic and logic unit. Memory comprises primary memory, secondary memory, and cache memory.
• Arithmetic and logic: Any arithmetic or logic operation, such as addition, subtraction, multiplication, division, or comparison of numbers, is initiated by bringing the required operands into the processor, where the operation is performed by the ALU.
• Output: Finally, the results are sent back to the outside world through the output unit.
• Control unit: The operation of the input, output, and memory units must be coordinated in some way. This is the responsibility of the control unit, which coordinates all of these actions.
An interconnection network provides the means for the functional units to exchange information and coordinate their actions.

Computer Memory

Primary memory is also called internal memory. This is the main area in a computer where data, instructions, and information are stored. Any storage location in this memory can be directly accessed by the Central Processing Unit. As the CPU can randomly access any storage location in this memory, it is also called Random Access Memory or RAM. The CPU can access data from RAM as long as the computer is switched on. As soon as the power to the computer is switched off, the stored data and instructions disappear from RAM. Such a type of memory is known as volatile memory. RAM is also called read/write memory. Read-Only Memory (ROM) is a type of primary memory from which information can only be read. ROM can also be directly accessed by the Central Processing Unit, but the data and instructions stored in ROM are retained even when the computer is switched off. Such a type of memory is known as non-volatile memory.

Secondary Memory

Primary memory is volatile and has limited capacity. So, it is important to have another form of memory that has a larger storage capacity and from which data and programs are not lost when the computer is turned off. Such memory is called secondary memory, or auxiliary memory, and it stores programs and data. It differs from primary memory in that it is not directly accessible by the CPU and is non-volatile. Secondary or external storage devices have a much larger storage capacity, and secondary memory costs less than primary memory.
Secondary Memory Characteristics:
• These are magnetic and optical memories.
• It is a type of non-volatile memory.
• Data is permanently stored even when the computer is turned off.
• It helps store data on a computer.
• The computer can function without secondary memory.
• It is slower than primary memory.
Examples: magnetic tapes, optical discs, floppy disks, flash memory (USB drives), paper tape, punched cards, etc.

Cache Memory
Cache memory, also called cache, is a supplementary memory system that temporarily stores frequently used instructions and data for quicker processing by the central processing unit (CPU) of a computer. The cache augments, and is an extension of, a computer's main memory.

The Processor Cache
The processor and a relatively small cache memory can be fabricated on a single integrated circuit chip.
• Speed
• Cost
• Memory management

Difference between Primary memory and Secondary memory

| Comparison parameter | Primary memory | Secondary memory |
| --- | --- | --- |
| Storage validity | Primary memory is the main memory and stores data temporarily. | Secondary memory is the external memory and stores data permanently. |
| Access | The CPU can directly access the data. | The CPU cannot directly access the data. |
| Volatility | Primary memory is volatile. It loses data in case of a power outage. | Secondary memory is non-volatile; data is retained even during a power failure. |
| Storage | Data is stored inside costly semiconductor chips. | Data is stored on external hardware devices like hard drives, floppy disks, etc. |
| Division | It can be divided into | They do not have such a classification. Secondary memories are permanent storage devices like CDs, DVDs, etc. |
| Speed | Faster | Slower |
| Stored data | It saves the data that the computer is currently using. | It can save various types of data in various formats and huge sizes. |

Arithmetic Logic Unit
The arithmetic logic unit (ALU) is a main component of the central processing unit and performs arithmetic and logic operations. It is also known as an integer unit (IU), an integrated circuit within a CPU or GPU. It performs all processing related to arithmetic and logic, such as addition, subtraction, and shifting, as well as Boolean operations (XOR, OR, AND, and NOT) and comparisons, all carried out on binary numbers. The arithmetic logic unit is split into an AU (arithmetic unit) and an LU (logic unit). The opcode is the first part of an instruction and tells the computer what function to perform; the opcode and the operands supplied to the ALU determine which operation it performs on the input data. When the ALU completes the processing of its input, the result is sent to a processor register or to the computer's memory.
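The way an operation code selects what the ALU does with its operands can be sketched in a few lines of Python. This is only an illustrative model: the 8-bit width, the function name `alu`, and the operation names are assumptions made for the example and do not come from any particular processor.

```python
# Hypothetical 8-bit ALU model; operation names are illustrative only.
MASK = 0xFF  # keep every result within 8 bits


def alu(op, a, b=0):
    """Return the 8-bit result of applying operation `op` to a and b."""
    if op == "ADD":
        result = a + b
    elif op == "SUB":
        result = a - b
    elif op == "AND":
        result = a & b
    elif op == "OR":
        result = a | b
    elif op == "XOR":
        result = a ^ b
    elif op == "NOT":
        result = ~a
    elif op == "SHL":  # shift left by one bit
        result = a << 1
    elif op == "SHR":  # logical shift right by one bit
        result = (a & MASK) >> 1
    else:
        raise ValueError(f"unknown operation: {op}")
    return result & MASK  # discard any bits beyond the 8-bit word


print(alu("ADD", 200, 100))        # 300 does not fit in 8 bits, so it wraps to 44
print(alu("XOR", 0b1100, 0b1010))  # 6
```

Masking the result models the fixed word size of a real ALU; a fuller model would also report carry and zero flags.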

Configuration of Arithmetic Logic Unit
How the ALU interacts with the rest of the processor depends on where instructions take their operands from. The common configurations are:
• Instruction set architecture: For three-operand instructions, the instruction format must be long enough to name three operands: 1 destination and 2 sources.
• Accumulator: The intermediate result of every operation is held in the accumulator, so the instruction set architecture (ISA) is simpler, because an instruction needs to name only one operand.
• Stack: The operands of the most recent operations are held on a stack, a small set of registers accessed in top-down order. When new operands are added, they are pushed on top of the older ones.
• Register to register: Each instruction has places for 1 destination register and 2 source registers; such a machine is also known as a 3-register operation machine.
• Register stack: The combination of register and accumulator operations is known as register-stack architecture. The operands are pushed onto the top of the stack, and the results are held at the top of the stack.
• Register memory: One operand comes from a register and the other comes from external memory. In practice this is combined with register-to-register operation and is not used on its own.

Control Unit
Control circuits are responsible for generating the timing signals that govern the transfers and determine when a given action is to take place. Data transfers between the processor and the memory are also managed by the control unit through timing signals. In principle, it is reasonable to think of a control unit as a well-defined, physically separate unit that interacts with other parts of the computer. In practice, however, much of the control circuitry is physically distributed throughout the computer. A large set of control lines (wires) carries the signals used for the timing and synchronization of events in all units.

Summary of Computer Operation
The operation of a computer can be summarized as follows:
• The computer accepts information in the form of programs and data through an input unit and stores it in the memory.
• Information stored in the memory is fetched under program control into an arithmetic and logic unit, where it is processed.
• Processed information leaves the computer through an output unit.
• All activities in the computer are directed by the control unit.

Example

Memory Locations and Addressing
Memory consists of many millions of storage cells, each of which can store a bit of information having the value 0 or 1. Because a single bit represents a very small amount of information, bits are seldom handled individually. The usual approach is to deal with them in groups of fixed size called words. Modern computers have word lengths that typically range from 16 to 64 bits. If the word length of a computer is 32 bits, a single word can store a 32-bit signed number or four ASCII-encoded characters, each occupying 8 bits. Each memory location is identified by an address, and a k-bit address can name $2^k$ distinct locations. For example, a 24-bit address generates an address space of $2^{24}$ (16,777,216) locations.
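The two numeric claims above (four 8-bit characters fit in a 32-bit word, and a 24-bit address names 16,777,216 locations) can be checked with a short Python sketch. The function names are hypothetical, and placing the first character in the most significant byte is an assumption; a real machine may use the opposite byte order.

```python
def address_space(address_bits):
    """Number of distinct locations a k-bit address can name."""
    return 2 ** address_bits


def pack_ascii_word(text):
    """Pack four ASCII characters into one 32-bit word.

    Assumption: the first character goes into the most significant byte.
    """
    if len(text) != 4:
        raise ValueError("exactly four characters are required")
    word = 0
    for ch in text:
        code = ord(ch)
        if code > 127:
            raise ValueError("not an ASCII character")
        word = (word << 8) | code  # make room for 8 bits, then add the code
    return word


print(address_space(24))             # 16777216
print(hex(pack_ascii_word("ABCD")))  # 0x41424344
```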


Example of Encoded Bit Information
ASCII: American Standard Code for Information Interchange

Basic Operational Concepts
To perform a given task, an appropriate program consisting of a list of instructions is stored in the memory. Individual instructions are brought from the memory into the processor, which executes the specified operations. Data to be used as instruction operands are also stored in the memory.
Example: A typical instruction might be
Load R2, LOC
where LOC is an address location. This instruction reads the contents of the memory location whose address is represented symbolically by the label LOC and loads them into processor register R2.

Basic Operational Concepts (cont.)
The original contents of location LOC are preserved, whereas those of register R2 are overwritten. Execution of this instruction requires several steps:
• First, the instruction is fetched from the memory into the processor.
• Next, the operation to be performed is determined by the control unit.
• The operand at LOC is then fetched from the memory into the processor.
• Finally, the operand is stored in register R2.
After operands have been loaded from memory into processor registers, arithmetic or logic operations can be performed on them.

Example
The instruction Add R4, R2, R3 adds the contents of registers R2 and R3, then places their sum into register R4. The operands in R2 and R3 are not altered, but the previous value in R4 is overwritten by the sum. After completing the desired operations, the results are in processor registers. They can be transferred to the memory using instructions such as Store R4, LOC. This instruction copies the operand in register R4 to memory location LOC. The original contents of location LOC are overwritten, but those of R4 are preserved. For Load and Store instructions, transfers between the memory and the processor are initiated by sending the address of the desired memory location to the memory unit and asserting the appropriate control signals. The data are then transferred to or from the memory.
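The effect of Load, Add, and Store on registers and memory can be traced with a minimal Python sketch. The dictionaries and the initial values (25, 17, 99) are assumptions chosen for illustration; only the instruction semantics follow the text.

```python
# Hypothetical machine state: one memory location and three registers.
memory = {"LOC": 25}
registers = {"R2": 0, "R3": 17, "R4": 99}


def load(reg, loc):
    """Load reg, loc: copy memory into a register; memory is unchanged."""
    registers[reg] = memory[loc]


def add(dst, src1, src2):
    """Add dst, src1, src2: only the destination register is overwritten."""
    registers[dst] = registers[src1] + registers[src2]


def store(reg, loc):
    """Store reg, loc: copy a register into memory; the register is unchanged."""
    memory[loc] = registers[reg]


load("R2", "LOC")      # R2 becomes 25, LOC still holds 25
add("R4", "R2", "R3")  # R4 becomes 25 + 17 = 42, R2 and R3 are unchanged
store("R4", "LOC")     # LOC becomes 42, R4 still holds 42
print(registers, memory)
```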

Connection between the Processor and Main Memory

Connection between the Processor and Main Memory (cont.)
In addition to the ALU and the control circuitry, the processor contains a number of registers used for several different purposes.
• The instruction register (IR) holds the instruction that is currently being executed. Its output is available to the control circuits, which generate the timing signals that control the various processing elements involved in executing the instruction.
• The program counter (PC) is another specialized register. It contains the memory address of the next instruction to be fetched and executed. During the execution of an instruction, the contents of the PC are updated to correspond to the address of the next instruction to be executed.
• In addition to the IR and PC, there are general-purpose registers R0 through Rn−1, often called processor registers. They serve a variety of functions, including holding operands that have been loaded from the memory for processing.
• The MAR (Memory Address Register) holds the address of the memory location to be accessed.
• The MDR (Memory Data Register) contains the data to be written into or read out of the addressed location.
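How the PC, IR, MAR, and MDR cooperate during instruction fetch and execution can be sketched as a toy simulator. Everything here is a simplifying assumption made for illustration: the class name `Processor`, four general-purpose registers, one instruction per memory location, and instructions stored as Python tuples rather than encoded bits.

```python
class Processor:
    """Toy processor with PC, IR, MAR, MDR and four registers R0..R3."""

    def __init__(self, memory):
        self.memory = memory  # address -> instruction tuple or data value
        self.pc = 0           # address of the next instruction
        self.ir = None        # instruction currently being executed
        self.mar = 0          # address of the location being accessed
        self.mdr = None       # data read from or written to that location
        self.regs = [0] * 4

    def read_memory(self, address):
        self.mar = address
        self.mdr = self.memory[self.mar]
        return self.mdr

    def step(self):
        self.ir = self.read_memory(self.pc)  # fetch the instruction
        self.pc += 1                          # PC now points to the next one
        op = self.ir[0]
        if op == "Load":
            _, reg, address = self.ir
            self.regs[reg] = self.read_memory(address)
        elif op == "Add":
            _, dst, src1, src2 = self.ir
            self.regs[dst] = self.regs[src1] + self.regs[src2]
        elif op == "Store":
            _, reg, address = self.ir
            self.mar = address
            self.mdr = self.regs[reg]
            self.memory[self.mar] = self.mdr
        else:
            raise ValueError(f"unknown instruction: {op}")


program = {
    0: ("Load", 2, 100),  # R2 <- [100]
    1: ("Load", 3, 101),  # R3 <- [101]
    2: ("Add", 0, 2, 3),  # R0 <- R2 + R3
    3: ("Store", 0, 102), # [102] <- R0
    100: 7, 101: 5, 102: 0,
}
cpu = Processor(program)
for _ in range(4):
    cpu.step()
print(cpu.regs[0], cpu.memory[102], cpu.pc)  # 12 12 4
```

Every memory access, whether for an instruction or an operand, passes through the MAR and MDR, which is why those two registers sit between the processor and the memory.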

Number Representation and Arithmetic Operations
We need to represent both positive and negative numbers. Three systems are used for representing such numbers:
• Sign-and-magnitude
• 1's-complement
• 2's-complement
In all three systems, the leftmost bit is 0 for positive numbers and 1 for negative numbers.
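The three systems can be compared by encoding the same value in each. The sketch below assumes 4-bit patterns by default, and the function names are hypothetical; positive numbers come out identical in all three systems, while negative numbers differ.

```python
def sign_magnitude(value, bits=4):
    """Leftmost bit is the sign; remaining bits hold the magnitude."""
    magnitude = abs(value)
    if magnitude >= 2 ** (bits - 1):
        raise ValueError("value does not fit")
    sign = 1 if value < 0 else 0
    return format((sign << (bits - 1)) | magnitude, f"0{bits}b")


def ones_complement(value, bits=4):
    """A negative number is the bitwise complement of its magnitude."""
    magnitude = abs(value)
    if magnitude >= 2 ** (bits - 1):
        raise ValueError("value does not fit")
    mask = (1 << bits) - 1
    pattern = magnitude if value >= 0 else (~magnitude) & mask
    return format(pattern, f"0{bits}b")


def twos_complement(value, bits=4):
    """A negative number is 2**bits minus its magnitude."""
    if not -(2 ** (bits - 1)) <= value < 2 ** (bits - 1):
        raise ValueError("value does not fit")
    return format(value & ((1 << bits) - 1), f"0{bits}b")


for v in (5, -5):
    print(v, sign_magnitude(v), ones_complement(v), twos_complement(v))
# 5 0101 0101 0101
# -5 1101 1010 1011
```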

Modular Number Systems and 2’s Complement System

Bus Structure
A bus is a group of lines that serves as a connecting path for several devices. A bus must have lines for data transfer, address, and control purposes. Because the bus can be used for only one transfer at a time, only 2 units can actively use the bus at any given time. Bus control lines are used to arbitrate multiple requests for use of the bus. The main advantages of a single bus are low cost and flexibility for attaching peripheral devices.

Single Bus Structure

Connections among bus and I/O devices

Data lines / Address lines / Control lines
• Data lines: Data lines transfer data among the system components. The data lines are collectively called the data bus. A data bus may have 32 lines, 64 lines, 128 lines, or even more. The number of lines in the data bus defines its width. Each data line can transfer only one bit at a time, so the number of data lines determines how many bits the bus can transfer at a time. The performance of the system therefore also depends on the width of the data bus.
• Address lines: The content of the address lines determines the source or destination of the data present on the data bus. The address lines are collectively called the address bus, and their number determines its width. The width of the address bus determines the maximum memory capacity the system can address: n address lines can select $2^n$ locations.
• Control lines: The address lines and data lines are shared by all the components of the system, so there must be some means to control their use. The control signals placed on the control lines control the use of and access to the address and data lines. A control signal consists of command and timing information. The command specifies the operation to be performed, and the timing information specifies when the data and address information is valid.
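The effect of bus width can be made concrete with a small calculation. This sketch assumes that every transfer moves exactly one bus-width of bits and that a partly filled final transfer still costs a full transfer; the function names are hypothetical.

```python
def addressable_locations(address_lines):
    """n address lines can select 2**n distinct locations."""
    return 2 ** address_lines


def transfers_needed(total_bits, data_lines):
    """Bus transfers needed to move total_bits over a data bus of the
    given width (ceiling division: a partial transfer counts as one)."""
    return -(-total_bits // data_lines)


print(addressable_locations(16))  # 65536
print(transfers_needed(256, 32))  # 8
print(transfers_needed(256, 64))  # 4
```

Doubling the data bus width from 32 to 64 lines halves the number of transfers for the same block, which is why data bus width affects performance.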

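Issues and Challenges
Speed issue: Different devices have different transfer and operating speeds. If the speed of the bus is bounded by the slowest device connected to it, efficiency will be very low. A common solution is to use buffers. Example: when printing characters, the processor sends a character over the bus to a buffer in the printer, and the bus is then free for other transfers while the printer prints.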

Bus Operation
A bus requires a set of rules, often called a bus protocol, that governs how the bus is used by various devices. The bus protocol determines when a device may place information on the bus, when it may load the data on the bus into one of its registers, and so on. These rules are implemented by control signals that indicate what actions are to be taken and when. One control line, usually labeled R/W, specifies whether a Read or a Write operation is to be performed. As the label suggests, it specifies Read when set to 1 and Write when set to 0. When several data sizes are possible, such as byte, halfword, or word, the required size is indicated by other control lines. The bus control lines also carry timing information. They specify the times at which the processor and the I/O devices may place data on or receive data from the data lines. A variety of schemes have been devised for the timing of data transfers over a bus. These can be broadly classified as either synchronous or asynchronous schemes.

Synchronous Bus
On a synchronous bus, all devices derive timing information from a control line called the bus clock, shown in the figure. The signal on this line has two phases: a high level followed by a low level. The two phases constitute a clock cycle. The first half of the cycle, between the low-to-high and high-to-low transitions, is often referred to as a clock pulse. The address and data lines in the figure are drawn as if they were carrying both high and low signal levels at the same time. This is a common convention for indicating that some lines are high and some are low, depending on the particular address or data values being transmitted. The crossing points indicate the times at which these patterns change.

Instructions for Read Operation Let us consider the sequence of signal events during an input (Read) operation. At time t0, the master places the device address on the address lines and sends a command on the control lines indicating a Read operation. The command may also specify the length of the operand to be read. Information travels over the bus at a speed determined by its physical and electrical characteristics. The clock pulse width, t1 − t0, must be longer than the maximum propagation delay over the bus. Also, it must be long enough to allow all devices to decode the address and control signals so that the addressed device (the slave) can respond at time t1 by placing the requested input data on the data lines. At the end of the clock cycle, at time t2, the master loads the data on the data lines into one of its registers. To be loaded correctly into a register, data must be available for a period greater than the setup time of the register. Hence, the period t2 − t1 must be greater than the maximum propagation time on the bus plus the setup time of the master’s register. A similar procedure is followed for a Write operation .
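The two timing constraints above give lower bounds on the clock phases, which can be worked out numerically. The sketch below is an illustration under stated assumptions: the decode time is assumed to add to the propagation delay in the first phase, the delay values (4 ns, 6 ns, 1 ns) are made up for the example, and the function name is hypothetical.

```python
def min_sync_bus_timing(max_propagation_ns, decode_ns, setup_ns):
    """Return lower bounds (t1 - t0, t2 - t1, clock period) in nanoseconds."""
    # t1 - t0: address and command must reach every device and be decoded.
    pulse_width = max_propagation_ns + decode_ns
    # t2 - t1: data must travel back to the master and remain stable
    # for the setup time of the master's register.
    data_phase = max_propagation_ns + setup_ns
    return pulse_width, data_phase, pulse_width + data_phase


pulse, data_phase, period = min_sync_bus_timing(4, 6, 1)
print(pulse, data_phase, period)  # 10 5 15
```

If the clock is required to be symmetric (equal high and low phases), the period would instead be at least twice the larger of the two phases, 20 ns for these example values.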

Detailed Timing Diagram of Previous Operation

I/P Data Transfer Using Multiple Clock Cycles

I/P Data Transfer Using Multiple Clock Cycles During clock cycle 1, the master sends address and command information on the bus, requesting a Read operation. The slave receives this information and decodes it. It begins to access the requested data on the active edge of the clock at the beginning of clock cycle 2. We have assumed that due to the delay involved in getting the data, the slave cannot respond immediately. The data become ready and are placed on the bus during clock cycle 3. The slave asserts a control signal called Slave-ready at the same time. The master, which has been waiting for this signal, loads the data into its register at the end of the clock cycle. The slave removes its data signals from the bus and returns its Slave-ready signal to the low level at the end of cycle 3. The bus transfer operation is now complete, and the master may send a new address and command signals to start a new transfer in clock cycle 4. The Slave-ready signal is an acknowledgment from the slave to the master, confirming that the requested data have been placed on the bus. It also allows the duration of a bus transfer to change from one device to another.
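The role of Slave-ready in stretching a transfer over a variable number of clock cycles can be sketched as a timeline generator. The function name and the `access_cycles` parameter are assumptions; with `access_cycles=1` the timeline matches the three-cycle transfer described above.

```python
def read_transfer(access_cycles=1):
    """Return (timeline, next_free_cycle) for one Read transfer.

    access_cycles is the number of clock cycles the slave spends
    accessing the data before it can place them on the bus.
    """
    timeline = [(1, "master places address and Read command on the bus")]
    for cycle in range(2, 2 + access_cycles):
        timeline.append((cycle, "slave accesses data; Slave-ready is low, master waits"))
    ready_cycle = 2 + access_cycles
    timeline.append(
        (ready_cycle, "slave places data on the bus and asserts Slave-ready; "
                      "master loads the data at the end of the cycle")
    )
    return timeline, ready_cycle + 1


timeline, next_cycle = read_transfer(access_cycles=1)
for cycle, event in timeline:
    print(cycle, event)
print("next transfer may start in cycle", next_cycle)  # cycle 4
```

A slower device simply keeps Slave-ready low for more cycles, so the same bus can serve devices with different access times.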

Asynchronous Operation An alternative scheme for controlling data transfers on a bus is based on the use of a handshake protocol between the master and the slave. Handshake is a command and response signal between the master and the slave. A control line called Master-ready is asserted by the master to indicate that it is ready to start a data transfer. The Slave responds by asserting Slave-ready. A data transfer controlled by a handshake protocol proceeds as follows. The master places the address and command information on the bus. Then it indicates to all devices that it has done so by activating the Master-ready line. This causes all devices to decode the address. The selected slave performs the required operation and informs the processor that it has done so by activating the Slave-ready line. The master waits for Slave-ready to become asserted before it removes its signals from the bus. In the case of a Read operation, it also loads the data into one of its registers.

Asynchronous Operation
t0: The master places the address and command information on the bus, and all devices on the bus decode this information.
t1: The master sets the Master-ready line to 1 to inform the devices that the address and command information is ready. The delay t1 − t0 is intended to allow for any skew that may occur on the bus. Skew occurs when two signals transmitted simultaneously from one source arrive at the destination at different times. This happens because different lines of the bus may have different propagation speeds. Thus, to guarantee that the Master-ready signal does not arrive at any device ahead of the address and command information, the delay t1 − t0 should be longer than the maximum possible bus skew.
t2: The selected slave, having decoded the address and command information, performs the required input operation by placing its data on the data lines. At the same time, it sets the Slave-ready signal to 1. If extra delays are introduced by the interface circuitry before it places the data on the bus, the slave must delay the Slave-ready signal accordingly. The period t2 − t1 depends on the distance between the master and the slave and on the delays introduced by the slave's circuitry.
t3: The Slave-ready signal arrives at the master, indicating that the input data are available on the bus. The master must allow for bus skew. It must also allow for the setup time needed by its register. After a delay equivalent to the maximum bus skew and the minimum setup time, the master loads the data into its register. Then, it drops the Master-ready signal, indicating that it has received the data.
t4: The master removes the address and command information from the bus. The delay between t3 and t4 is again intended to allow for bus skew. Erroneous addressing may take place if the address, as seen by some device on the bus, starts to change while the Master-ready signal is still equal to 1.
t5: When the device interface receives the 1-to-0 transition of the Master-ready signal, it removes the data and the Slave-ready signal from the bus. This completes the input transfer.
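The six timing points of the handshake can be traced in software. The following is a minimal sketch, not a hardware model: the function name `handshake_read`, the dictionary standing in for the bus lines, and the `slaves` mapping are all assumptions made for illustration, and skew and setup delays are represented only by comments.

```python
def handshake_read(address, slaves):
    """Simulate one input (Read) transfer; `slaves` maps address -> stored value."""
    bus = {"address": None, "command": None, "data": None,
           "master_ready": 0, "slave_ready": 0}
    trace = []

    # t0: the master places the address and command on the bus.
    bus["address"], bus["command"] = address, "Read"
    trace.append("t0")

    # t1: after waiting longer than the maximum skew, the master asserts Master-ready.
    bus["master_ready"] = 1
    trace.append("t1")

    # t2: the selected slave places its data on the bus and asserts Slave-ready.
    if bus["master_ready"] and bus["address"] in slaves:
        bus["data"] = slaves[bus["address"]]
        bus["slave_ready"] = 1
    trace.append("t2")

    # t3: the master sees Slave-ready, loads the data, then drops Master-ready.
    if not bus["slave_ready"]:
        raise LookupError("no slave responded to address %#x" % address)
    register = bus["data"]
    bus["master_ready"] = 0
    trace.append("t3")

    # t4: the master removes the address and command from the bus.
    bus["address"] = bus["command"] = None
    trace.append("t4")

    # t5: the slave sees Master-ready fall and removes data and Slave-ready.
    bus["data"], bus["slave_ready"] = None, 0
    trace.append("t5")

    return register, trace, bus


value, trace, bus = handshake_read(0x10, {0x10: 42})
```

Note that the master never assumes a fixed response time: it proceeds past t3 only after Slave-ready is observed, which is what distinguishes this scheme from a clocked bus.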

Bus Arbitration There are occasions when two or more entities contend for the use of a single resource in a computer system. For example, two devices may need to access a given slave at the same time. In such cases, it is necessary to decide which device will access the slave first. The decision is usually made in an arbitration process performed by an arbiter circuit. The arbitration process starts with each device sending a request to use the shared resource. The arbiter associates priorities with individual requests. If it receives two requests at the same time, it grants the use of the slave to the device having the higher priority first. BR1: Bus Request line; BG1: Bus Grant line.

Granting the Bus based on Priorities
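Fixed-priority arbitration can be sketched in a few lines. This is an illustrative assumption about how the arbiter decides, not a description of a particular circuit: the function name `arbitrate` and the device names are hypothetical, and `priority_order` lists devices from highest to lowest priority.

```python
def arbitrate(requests, priority_order):
    """Return the requesting device with the highest priority, or None.

    requests: set of devices currently asserting their Bus Request line.
    priority_order: all devices, highest priority first.
    """
    for device in priority_order:
        if device in requests:
            return device      # this device's Bus Grant line is asserted
    return None                # no requests: the bus stays idle


order = ["Device1", "Device2", "Device3"]
granted = arbitrate({"Device3", "Device2"}, order)   # Device2 outranks Device3
```

When the granted device finishes and withdraws its request, calling `arbitrate` again with the remaining requests grants the bus to the next device in priority order.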

Performance The most important measure of a computer is how quickly it can execute programs. Three factors affect performance: hardware design, instruction set, and compiler.

Processor Clock

Performance The performance of a computer depends on its constituent subsystems, including software. Each subsystem can be measured and tuned for performance. Thus performance can be measured for:
CPU performance for scientific applications, vector processing, business applications, etc.: instructions per second
Graphics performance (rendering): pixels per second
I/O performance: transactions per second
Internet performance and more: bandwidth utilization in Mbps or Gbps

Basic Performance Equation Performance = 1 / Execution Time, where execution time is the time taken to execute a program. CPU time in seconds (TCPU) = number of instructions in the program / average number of instructions executed per second by the CPU, or equivalently, number of instructions in the program x average clock cycles per instruction x time per clock cycle. Written arithmetically: TCPU = N x CPI x Tclk, where N is the number of machine instructions executed, CPI is the average number of clock cycles per instruction required by the CPU, and Tclk is the time per clock cycle, Tclk = 1/f for a clock frequency f.

Example Let us use an example to reinforce our learning on CPU performance. A program ABCD executes 15000 instructions on a system whose clock frequency is 3.3 GHz and whose design gives an average of 12 cycles per instruction. Calculate the CPU time used to execute program ABCD. Here, N = 15000, CPI = 12, and Tclk = 1/(3.3 GHz) = 1/(3.3 x 10^9) s ≈ 0.303 x 10^-9 s. Therefore, CPU execution time TCPU = N x CPI x Tclk = 15000 x 12 / (3.3 x 10^9) seconds ≈ 54.5 x 10^-6 seconds ≈ 54.5 microseconds.
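The same calculation can be checked numerically without rounding intermediate values. This is a small sketch; the function name `cpu_time` is an assumption made for illustration.

```python
def cpu_time(n_instructions, cpi, clock_hz):
    """CPU time in seconds: N x CPI x Tclk, with Tclk = 1 / clock_hz."""
    return n_instructions * cpi / clock_hz


t = cpu_time(15_000, 12, 3.3e9)
print(f"{t * 1e6:.2f} microseconds")   # prints "54.55 microseconds"
```

Doubling the clock frequency or halving the CPI halves the result, which is why hardware design, instruction set, and compiler all influence execution time.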

Parallelism Performance can be enhanced by performing a number of operations in parallel. Parallelism can be implemented on many different levels Instruction Level Parallelism The simplest way to execute a sequence of instructions in a processor is to complete all steps of the current instruction before starting the steps of the next instruction. If we overlap the execution of the steps of successive instructions, the total execution time will be reduced. For example, the next instruction could be fetched from memory at the same time that an arithmetic operation is being performed on the register operands of the current instruction. This form of parallelism is called pipelining.
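The benefit of overlapping instruction steps can be estimated with a simple cycle count. The sketch below assumes an idealized pipeline that is not described in the slides: every instruction passes through the same number of one-cycle stages and there are no stalls. The function names are hypothetical.

```python
def sequential_cycles(n_instructions, stages):
    """Each instruction completes all of its steps before the next one starts."""
    return n_instructions * stages


def pipelined_cycles(n_instructions, stages):
    """Ideal pipeline: fill it once, then one instruction completes per cycle."""
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1)


# Two-step example from the text: fetch overlaps with the previous execute.
before = sequential_cycles(100, 2)   # 200 cycles
after = pipelined_cycles(100, 2)     # 101 cycles
```

Under these assumptions the speedup approaches the number of stages as the instruction count grows; real pipelines fall short of this because of dependencies and branches.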

Pipeline and Superscalar Operation

Multicore processor Multiple processing units can be fabricated on a single chip. In technical literature, the term core is used for each of these processors. The term processor is then used for the complete chip. Hence, we have the terminology dual-core, quad-core, and octo-core processors for chips that have two, four, and eight cores, respectively.

Multiprocessor Computer systems may contain many processors, each possibly containing multiple cores. Such systems are called multiprocessors. These systems either execute a number of different application tasks in parallel or execute subtasks of a single large task in parallel. All processors usually have access to all of the memory in such systems, and the term shared-memory multiprocessor is often used to make this clear. The high performance of these systems comes with much higher complexity and cost, arising from the use of multiple processors and memory units, along with more complex interconnection networks.

Multicomputer A multicomputer is a system that contains numerous processors that work together to solve a problem. Every processor has its own memory, which is accessible only by that processor. An interconnection network allows the processors to communicate with one another. Because the processors can exchange messages, a task can be divided among them to be completed. Therefore, a multicomputer may be used for distributed computation. A multicomputer is easier and less expensive to develop than a multiprocessor.

Multiprocessor Vs. Multicomputer Note: The term multicomputer generally refers to an architecture in which each processor has its own memory, rather than multiple processors sharing a common memory.

Historical perspective The history of the computer dates back several decades. There are four generations of computers. Each generation has witnessed technological advances that changed the functionality of computers. The result has been more compact, more powerful, and more robust systems that are less expensive.

First generation (1940-1956) Programs and their data were located in the same memory, as they are today. This facilitates changing existing programs and data or preparing and loading new programs and data. Assembly language was used to prepare programs and was translated into machine language for execution. Basic arithmetic operations were performed in a few milliseconds, using vacuum tube technology to implement logic functions. The memory capacity was up to 4000 bits. Ex: ENIAC, UNIVAC, IBM 701.

Second generation (1956-1963) Several advancements in the first-generation computers led to the development of second-generation computers. The transistor was invented at AT&T Bell Laboratories in the late 1940s and quickly replaced the vacuum tube in implementing logic functions. Magnetic core memories and magnetic drum storage devices were widely used in the second generation. Compilers were developed to translate high-level language programs into assembly language, which was then translated into executable machine-language form. These computers used batch operating systems and were faster and smaller in size. The memory capacity was 32,000 bits, and input was provided through punch cards. Ex: Honeywell 400, CDC 1604, IBM 7030.

Third generation (1964-1971) Texas Instruments and Fairchild Semiconductor developed the ability to fabricate many transistors on a single silicon chip, called integrated-circuit technology. This enabled faster and less costly processors and memory elements to be built. Integrated-circuit memories began to replace magnetic core memories. Other developments included the introduction of microprogramming, parallelism, and pipelining. Cache memory makes the main memory appear faster than it really is, and virtual memory makes it appear larger. The memory capacity was 128,000 bits. Input and output were provided through a keyboard and a monitor. Ex: IBM 360/370, CDC 6600, PDP 8/11.

Fourth generation (1971 to present) Integrated circuits made from semiconductor materials were used in this generation. A very large number of transistors could be placed on a single chip, and the technology became known as Very Large Scale Integration (VLSI). A complete processor fabricated on a single chip became known as a microprocessor. Companies such as Intel, National Semiconductor, Motorola, Texas Instruments, and Advanced Micro Devices have been the driving forces of this technology. A particular form of VLSI technology, called Field Programmable Gate Arrays (FPGAs), has allowed system developers to design and implement processor, memory, and I/O circuits on a single chip to meet the requirements of specific applications. This generation supports object-oriented high-level languages and includes small, easy-to-use handheld computers. The memory capacity was 100 million bits. Input was provided through improved handheld devices, keyboard, and mouse. Parallelism and hierarchical memories have evolved to produce high-performance computing systems. Ex: Apple II, VAX 9000, CRAY 1 (supercomputer).

Accessing I/O Devices (Memory Mapped I/O) Each I/O device must appear to the processor as consisting of some addressable locations, just like the memory. Some addresses in the address space of the processor are assigned to these I/O locations, rather than to the main memory. These locations are usually implemented as bit storage circuits (flip-flops) organized in the form of registers. It is customary to refer to them as I/O registers. Since the I/O devices and the memory share the same address space, this arrangement is called memory-mapped I/O. It is used in most computers. With memory-mapped I/O, any machine instruction that can access memory can be used to transfer data to or from an I/O device. For example, if DATAIN is the address of a register in an input device, the instruction Load R2, DATAIN reads the data from the DATAIN register and loads them into processor register R2. Similarly, the instruction Store R2, DATAOUT sends the contents of register R2 to location DATAOUT, which is a register in an output device.
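The idea that one Load or Store instruction serves both memory and I/O registers can be sketched as follows. The class `AddressSpace` and the addresses chosen for DATAIN and DATAOUT are hypothetical values picked for illustration; real systems fix these addresses in hardware.

```python
DATAIN, DATAOUT = 0x4000, 0x4004   # hypothetical I/O register addresses


class AddressSpace:
    """One address space shared by main memory and I/O registers."""

    def __init__(self):
        self.memory = {}                             # ordinary memory (sparse)
        self.io_registers = {DATAIN: 0, DATAOUT: 0}  # device registers

    def load(self, address):
        # The same Load reaches an I/O register or memory, depending on address.
        if address in self.io_registers:
            return self.io_registers[address]
        return self.memory.get(address, 0)

    def store(self, address, value):
        if address in self.io_registers:
            self.io_registers[address] = value
        else:
            self.memory[address] = value


space = AddressSpace()
space.io_registers[DATAIN] = ord("A")   # the input device deposits a character
r2 = space.load(DATAIN)                 # Load  R2, DATAIN
space.store(DATAOUT, r2)                # Store R2, DATAOUT
space.store(0x100, r2)                  # same instruction, ordinary memory
```

No special I/O instruction appears anywhere: the address alone determines whether a device or the memory responds.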

Program-Controlled I/O Consider a task that reads characters typed on a keyboard, stores these data in the memory, and displays the same characters on a display screen. A simple way of implementing this task is to write a program that performs all functions needed to realize the desired action. This method is known as program-controlled I/O. For input, a character can be read only after a key has been pressed; for output, a character must be sent to the display only when the display device is able to accept it. The rate of data transfer from the keyboard to a computer is limited by the typing speed of the user, which is unlikely to exceed a few characters per second. The rate of output transfers from the computer to the display is much higher. It is determined by the rate at which characters can be transmitted to and displayed on the display device, typically several thousand characters per second. However, this is still much slower than the speed of a processor that can execute billions of instructions per second. The difference in speed between the processor and I/O devices creates the need for mechanisms to synchronize the transfer of data between them.

I/O Device Interface

I/O Device Interface An I/O device is connected to the interconnection network by using a circuit, called the device interface, which provides the means for data transfer and for the exchange of status and control information needed to facilitate the data transfers and govern the operation of the device. The interface includes some registers that can be accessed by the processor. One register may serve as a buffer for data transfers, another may hold information about the current status of the device, and yet another may store the information that controls the operational behavior of the device. These data, status, and control registers are accessed by program instructions as if they were in memory locations. Typical transfers of information are between I/O registers and the registers in the processor .

Registers in the Keyboard and Display Interface

Registers in the Keyboard and Display Interface The keyboard includes a circuit that responds to a key being pressed by producing the code for the corresponding character that can be used by the computer. We will assume that ASCII code is used, in which each character code occupies one byte. Let KBD_DATA be the address label of an 8-bit register that holds the generated character. Also, let a signal indicating that a key has been pressed be provided by setting to 1 a flip-flop called KIN, which is a part of an eight-bit status register, KBD_STATUS. The processor can read the status flag KIN to determine when a character code has been placed in KBD_DATA . When the processor reads the status flag to determine its state, we say that the processor polls the I/O device. The display includes an 8-bit register, which we will call DISP_DATA, used to receive characters from the processor. It also must be able to indicate that it is ready to receive the next character; this can be done by using a status flag called DOUT, which is one bit in a status register, DISP_STATUS.

Example: Input Process Let us consider the details of the input process. When a key is pressed, the keyboard circuit places the ASCII-encoded character into the KBD_DATA register. At the same time, the circuit sets the KIN flag to 1. Meanwhile, the processor is executing the I/O program, which continuously checks the state of the KIN flag. When it detects that KIN is set to 1, it transfers the contents of KBD_DATA into a processor register. Once the contents of KBD_DATA are read, KIN must be cleared to 0, which is usually done automatically by the interface circuit. If a second character is entered at the keyboard, KIN is again set to 1 and the process repeats. The wait loop can be written as:
READWAIT  Read the KIN flag
          Branch to READWAIT if KIN = 0
          Transfer data from KBD_DATA to R5
The last operation reads the character into processor register R5.

Example: Output Display An analogous process takes place when characters are transferred from the processor to the display. When DOUT is equal to 1, the display is ready to receive a character. Under program control, the processor monitors DOUT, and when DOUT is equal to 1, the processor transfers an ASCII-encoded character to DISP_DATA. The transfer of a character to DISP_DATA clears DOUT to 0. When the display device is ready to receive a second character, DOUT is again set to 1. This can be achieved by performing the operations:
WRITEWAIT  Read the DOUT flag
           Branch to WRITEWAIT if DOUT = 0
           Transfer data from R5 to DISP_DATA

Example: Output Display Cont … The wait loop is executed repeatedly until the status flag DOUT is set to 1 by the display when it is free to receive a character. Then, the character from R5 is transferred to DISP_DATA to be displayed, which also clears DOUT to 0. We assume that the initial state of KIN is 0 and the initial state of DOUT is 1. This initialization is normally performed by the device control circuits when power is turned on. In computers that use memory-mapped I/O, in which some addresses are used to refer to registers in I/O interfaces, data can be transferred between these registers and the processor using instructions such as Load, Store, and Move.
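The two wait loops can be exercised against simulated devices. This is a minimal sketch under stated assumptions: the classes `Keyboard` and `Display`, the `delay` and `busy_polls` parameters that stand in for device timing, and the function `echo` are all hypothetical. Only the flag behavior (KIN and DOUT set by the device, cleared by the data transfer) follows the description above.

```python
from collections import deque


class Keyboard:
    def __init__(self, keys, delay=3):
        self.pending = deque(keys)   # keys the user will type
        self.delay = delay           # polls that elapse before each keypress
        self.countdown = delay
        self.kbd_data = 0
        self.kin = 0                 # initial state of KIN is 0

    def read_status(self):
        # Each poll models time passing until the next key is pressed.
        if not self.kin and self.pending:
            self.countdown -= 1
            if self.countdown <= 0:
                self.kbd_data = ord(self.pending.popleft())
                self.kin = 1
                self.countdown = self.delay
        return self.kin

    def read_data(self):
        self.kin = 0                 # reading KBD_DATA clears KIN
        return self.kbd_data


class Display:
    def __init__(self, busy_polls=2):
        self.busy_polls = busy_polls
        self.busy = 0                # 0 means DOUT = 1 (initial state: ready)
        self.screen = []

    def read_status(self):
        if self.busy:
            self.busy -= 1
        return 1 if self.busy == 0 else 0

    def write_data(self, code):
        self.screen.append(chr(code))
        self.busy = self.busy_polls  # writing DISP_DATA clears DOUT to 0


def echo(keyboard, display, count):
    """Read `count` characters and display them; count must not exceed the keys typed."""
    for _ in range(count):
        while keyboard.read_status() == 0:   # READWAIT loop
            pass
        r5 = keyboard.read_data()            # KBD_DATA -> R5
        while display.read_status() == 0:    # WRITEWAIT loop
            pass
        display.write_data(r5)               # R5 -> DISP_DATA
    return "".join(display.screen)


shown = echo(Keyboard("Hi"), Display(), 2)
```

The processor does no useful work inside either `while` loop, which is the main cost of program-controlled I/O.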

Functions of Interface Circuits The I/O interface of a device consists of the circuitry needed to connect that device to the bus. On one side of the interface are the bus lines for address, data, and control. On the other side are the connections needed to transfer data between the interface and the I/O device. This side is called a port, and it can be either a parallel or a serial port. A parallel port transfers multiple bits of data simultaneously to or from the device. A serial port sends and receives data one bit at a time. Communication with the processor is the same for both formats; the conversion from a parallel to a serial format and vice versa takes place inside the interface circuit.

Summary of I/O Interface Circuits An I/O interface:
Provides a register for the temporary storage of data.
Includes a status register containing status information that can be accessed by the processor.
Includes a control register that holds the information governing the behavior of the interface.
Contains address-decoding circuitry to determine when it is being addressed by the processor, and generates the required timing signals.
Performs any format conversion that may be necessary to transfer data between the processor and the I/O device, such as parallel-to-serial conversion in the case of a serial port.

Parallel Interface A parallel interface is a multiline channel in which each line carries one bit, so that several bits of data are transmitted simultaneously. First, we describe an interface circuit for an 8-bit input port that can be used for connecting a simple input device, such as a keyboard. Then, we describe an interface circuit for an 8-bit output port, which can be used with an output device such as a display. We assume that these interface circuits are connected to a 32-bit processor that uses memory-mapped I/O and the asynchronous bus protocol.

Keyboard-to-Processor Connection

Keyboard-to-Processor Connection The output of the encoder consists of one byte of data representing the encoded character and one control signal called Valid. When a key is pressed, the Valid signal changes from 0 to 1, causing the ASCII code of the corresponding character to be loaded into the KBD_DATA register and the status flag KIN to be set to 1. The status flag is cleared to 0 when the processor reads the contents of the KBD_DATA register. The interface circuit is shown connected to an asynchronous bus on which transfers are controlled by the handshake signals Master-ready and Slave-ready. The bus has one other control line, R/W, which indicates a Read operation when equal to 1.

An Input Interface Circuit

Explanation of the Input Interface Circuit With reference to the previous figure, there are two addressable locations in this interface, KBD_DATA and KBD_STATUS. They occupy adjacent word locations in the address space. Only one bit of the status register carries useful information: the keyboard status flag, KIN. When the status register is read by the processor, all other bit locations appear as containing zeros. When the processor requests a Read operation, it places the address of the appropriate register on the address lines of the bus. The address decoder in the interface circuit examines bits A31−3 and asserts its output, My-address, when one of the two registers KBD_DATA or KBD_STATUS is being addressed. Bit A2 determines which of the two registers is involved. Hence, a multiplexer is used to select the register to be connected to the bus based on address bit A2. The two least-significant address bits, A1 and A0, are not used, because we have assumed that all addresses are word-aligned. The output of the multiplexer is connected to the data lines of the bus through a set of tri-state gates. The interface circuit turns the tri-state gates on only when the three signals Master-ready, My-address, and R/W are all equal to 1, indicating a Read operation. The Slave-ready signal is asserted at the same time, to inform the processor that the requested data or status information has been placed on the data lines. When address bit A2 is equal to 0, Read-data is also asserted. This signal is used to reset the KIN flag.
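The decoding logic described above can be expressed as Boolean operations on the address bits. This is a sketch under stated assumptions: `BASE_ADDRESS` is a hypothetical location for the two registers, the function name `decode` is made up for illustration, and KBD_DATA is placed at the address with A2 = 0 because that is the case in which Read-data is asserted.

```python
BASE_ADDRESS = 0x4000   # hypothetical; must be a multiple of 8


def decode(address, read_write, master_ready):
    """Model the control signals of the input interface for one bus cycle."""
    my_address = (address >> 3) == (BASE_ADDRESS >> 3)   # compare bits A31-3
    a2 = (address >> 2) & 1                              # selects the register
    # Tri-state gates turn on only when all three signals equal 1.
    enable = my_address and read_write == 1 and master_ready == 1
    if not my_address:
        register = None
    else:
        register = "KBD_STATUS" if a2 else "KBD_DATA"
    return {
        "my_address": my_address,
        "register": register,
        "slave_ready": enable,              # asserted together with the gates
        "read_data": enable and a2 == 0,    # resets the KIN flag
    }


data_read = decode(0x4000, read_write=1, master_ready=1)
status_read = decode(0x4004, read_write=1, master_ready=1)
```

Bits A1 and A0 never enter the computation, matching the assumption that all addresses are word-aligned.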

Internal Circuit for the Status Flag Block

Circuit for the Status Flag Block The KIN flag is the output of a NOR latch connected as shown in the previous Fig. A flip-flop is set to 1 by the rising edge on the Valid signal line. This event changes the state of the NOR latch to set KIN to 1, but only when Master-ready is low. The reason for this additional condition is to ensure that KIN does not change state while being read by the processor. Both the flip-flop and the latch are reset to 0 when Read-data becomes equal to 1, indicating that KBD_DATA is being read.

Output Interface (Display to Processor Connection)

Output Interface (Display to Processor Connection) The display uses two handshake signals, New-data and Ready, in a manner similar to the handshake between the bus signals Master-ready and Slave-ready. When the display is ready to accept a character, it asserts its Ready signal, which causes the DOUT flag in the DISP_STATUS register to be set to 1. When the I/O routine checks DOUT and finds it equal to 1, it sends a character to DISP_DATA. This clears the DOUT flag to 0 and sets the New-data signal to 1. In response, the display returns Ready to 0 and accepts and displays the character in DISP_DATA. When it is ready to receive another character, it asserts Ready again, and the cycle repeats.

Serial Interface A serial interface is used to connect the processor to I/O devices that transmit data one bit at a time. Data are transferred in a bit-serial fashion on the device side and in a bit-parallel fashion on the processor side. The transformation between the parallel and serial formats is achieved with shift registers that have parallel access capability. The input shift register accepts bit-serial input from the I/O device. When all 8 bits of data have been received, the contents of this shift register are loaded in parallel into the DATAIN register. Similarly, output data in the DATAOUT register are transferred to the output shift register, from which the bits are shifted out and sent to the I/O device. There are two basic approaches to timing the transfer of bits. The first is known as asynchronous transmission because the receiver uses a clock that is not synchronized with the transmitter clock. In the second approach, the receiver is able to generate a clock that is synchronized with the transmitter clock. Hence it is called synchronous transmission.

Serial Interface

Serial Interface Cont.. Two status flags, which we will refer to as SIN and SOUT, are maintained by the Status and control block. The SIN flag is set to 1 when new data are loaded into DATAIN from the shift register and cleared to 0 when these data are read by the processor. The SOUT flag indicates whether the DATAOUT register is available. It is cleared to 0 when the processor writes new data into DATAOUT and set to 1 when data are transferred from DATAOUT to the output shift register. It is possible to implement DATAIN and DATAOUT themselves as shift registers, thus obviating the need for separate shift registers. However, this would impose awkward restrictions on the operation of the I/O device. After receiving one character from the serial line, the interface would not be able to start receiving the next character until the processor reads the contents of DATAIN. Thus, a pause would be needed between two characters to give the processor time to read the input data. During serial transmission, the receiver needs to know when to shift each bit into its input shift register.
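The role of the separate input shift register and the SIN flag can be sketched as follows. The class `SerialInput` is hypothetical, and the bit order (least-significant bit first) is an assumption made for this example rather than something the slides specify.

```python
class SerialInput:
    """Input side of a serial interface: shift register, DATAIN, and SIN."""

    def __init__(self):
        self.shift = []     # bits collected so far
        self.datain = 0
        self.sin = 0

    def receive_bit(self, bit):
        self.shift.append(bit)
        if len(self.shift) == 8:
            # Parallel load into DATAIN (assumes the LSB arrived first).
            self.datain = sum(b << i for i, b in enumerate(self.shift))
            self.shift.clear()
            self.sin = 1    # new data available

    def read(self):
        self.sin = 0        # reading DATAIN clears SIN
        return self.datain


rx = SerialInput()
for i in range(8):
    rx.receive_bit((0x41 >> i) & 1)    # character 'A', LSB first
for i in range(4):
    rx.receive_bit(1)                  # next character already arriving
first = rx.read()                      # DATAIN still holds 'A'
```

Because the shift register is separate from DATAIN, the bits of the next character can arrive before the processor has read the previous one, which is the restriction the text says a single combined register would impose.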

Asynchronous Transmission/ Start- Stop Transmission This approach uses a technique called start-stop transmission. Data are organized in small groups of 6 to 8 bits, with a well-defined beginning and end. In a typical arrangement, alphanumeric characters encoded in 8 bits are transmitted as shown in Fig. The line connecting the transmitter and the receiver is in the 1 state when idle . A character is transmitted as a 0 bit, referred to as the Start bit, followed by 8 data bits and 1 or 2 Stop bits. The Stop bits have a logic value of 1. The 1-to-0 transition at the beginning of the Start bit alerts the receiver that data transmission is about to begin. Using its own clock, the receiver determines the position of the next 8 bits, which it loads into its input register. The Stop bits following the transmitted character, which are equal to 1, ensure that the Start bit of the next character will be recognized. When transmission stops, the line remains in the 1 state until another character is transmitted.
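Start-stop framing can be sketched as a pair of functions operating on lists of line levels, one list element per bit time. The function names `frame` and `receive` are hypothetical, sending the least-significant data bit first is an assumption, and the receiver's clock is idealized as sampling exactly once per bit.

```python
def frame(byte, stop_bits=1):
    """Start bit (0), 8 data bits (LSB first, an assumption), then Stop bits (1)."""
    data = [(byte >> i) & 1 for i in range(8)]
    return [0] + data + [1] * stop_bits


def receive(line):
    """Recover bytes from line levels; a 1 is idle or a Stop bit, a 0 starts a character."""
    out = []
    i = 0
    while i < len(line):
        if line[i] == 1:           # idle line or Stop bit: keep waiting
            i += 1
            continue
        bits = line[i + 1:i + 9]   # the 8 bit positions after the Start bit
        if len(bits) < 8:
            break                  # truncated character at the end of the capture
        out.append(sum(b << k for k, b in enumerate(bits)))
        i += 9                     # Stop bits are then skipped as 1s
    return bytes(out)


line = [1, 1] + frame(0x48) + frame(0x69, stop_bits=2) + [1]
text = receive(line)               # b"Hi"
```

The Stop bits guarantee that the line is at 1 before the next character, so the following Start bit always produces a detectable 1-to-0 transition.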

Synchronous Transmission In the start-stop scheme described above, the position of the 1-to-0 transition at the beginning of the start bit is the key to obtaining correct timing information. In synchronous transmission, the receiver generates a clock that is synchronized to that of the transmitter by observing successive 1-to-0 and 0-to-1 transitions in the received signal. It adjusts the position of the active edge of the clock to be in the center of the bit position. A variety of encoding schemes are used to ensure that enough signal transitions occur to enable the receiver to generate a synchronized clock and to maintain synchronization. Once synchronization is achieved, data transmission can continue indefinitely.

Serial Interface vs. Parallel Interface
Sr. No. | Parameter | Serial Interface | Parallel Interface
1 | Main concept | One bit at a time. | Multiple bits at a time.
2 | Cost | Requires fewer pins and is easy to set up, so the cost is comparatively low. | Requires more pins and is harder to set up, so the cost is comparatively high.
3 | Transmission | Bits of data are sent one at a time in a single stream of 1s and 0s along a single wire. | A set of 8 bits can be moved at the same time along 8 separate wires, so multiple bits of data are sent simultaneously.
4 | Speed | Lower, for the same signaling rate per line. | Higher, for the same signaling rate per line.
5 | Capability | Transmits a single stream of data at a time. | Transmits multiple data streams at a time.
6 | Port type | On classic PC hardware, typically a male connector. | On classic PC hardware, typically a female connector.
7 | Applications | Modems, security cameras, device controllers, GPS receivers, telescopes, power inverters, and flat-screen monitors. | Printers, hard drives, CD drives, scanners, external CD-ROM drives, and optical drives.

Interrupts An interrupt in computer architecture is a signal that requests the processor to suspend its current execution and service the occurred interrupt. To service the interrupt the processor executes the corresponding interrupt service routine (ISR). After the execution of the interrupt service routine, the processor resumes the execution of the suspended program. Interrupts can be of two types hardware interrupts and software interrupts.

Hardware Interrupts If a processor receives the interrupt request from an external I/O device it is termed a hardware interrupt . Hardware interrupts are further divided into maskable and non-maskable interrupts. Maskable Interrupt: The hardware interrupt that can be ignored or delayed for some time if the processor is executing a program with higher priority is termed a maskable interrupt. Non-Maskable Interrupt: The hardware interrupts that can neither be ignored nor delayed and must immediately be serviced by the processor are termed non-maskable interrupts.
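The difference between maskable and non-maskable requests can be sketched as follows. This is an illustrative model only: the class `Processor`, its method names, and the policy of queueing masked requests until interrupts are re-enabled are assumptions, since the slides say only that maskable interrupts may be ignored or delayed.

```python
class Processor:
    def __init__(self):
        self.interrupts_enabled = True
        self.pending = []   # maskable requests deferred while masked
        self.log = []

    def disable_interrupts(self):
        self.interrupts_enabled = False

    def enable_interrupts(self):
        self.interrupts_enabled = True
        while self.pending:                 # service what was delayed
            self._service(self.pending.pop(0))

    def interrupt(self, source, maskable=True):
        if maskable and not self.interrupts_enabled:
            self.pending.append(source)     # delayed, not lost
            return
        self._service(source)               # non-maskable: always serviced

    def _service(self, source):
        self.log.append(f"suspend program; run ISR for {source}; resume")


cpu = Processor()
cpu.disable_interrupts()
cpu.interrupt("keyboard")                     # maskable: deferred
cpu.interrupt("power-fail", maskable=False)   # serviced immediately
cpu.enable_interrupts()                       # keyboard is serviced now
```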

Software Interrupts A software interrupt is a type of interrupt that is caused either by a special instruction in the instruction set or by an exceptional condition in the processor itself. A software interrupt often occurs when an application software terminates or when it requests the operating system for some service.

What is a Processor? The processor is a chip or a logic circuit that responds to and processes the basic instructions that drive a computer. The main functions of the processor are fetching, decoding, executing, and writing back the results of an instruction. The processor is often called the brain of a system and is found in computers, laptops, smartphones, embedded systems, etc. The ALU (Arithmetic Logic Unit) and the CU (Control Unit) are the two main parts of the processor. The Arithmetic Logic Unit performs all mathematical operations such as addition, subtraction, multiplication, and division, as well as logic operations. The Control Unit works like traffic police: it manages the sequencing and execution of instructions. The processor also communicates with the other components, namely input/output devices and memory/storage devices.

Types of Processor
General Purpose Processor (GPP): A GPP is used for processing signals from input to output by controlling the operation of the system bus, address bus, and data bus.
Microprocessor: A microprocessor is a computer processor where the data processing logic and control are included on a single integrated circuit or a small number of ICs. The microprocessor contains the arithmetic, logic, and control circuitry required to perform the functions of a computer's central processing unit.
Microcontroller: A microcontroller is a small computer on a single VLSI integrated circuit chip. A microcontroller contains one or more CPUs along with memory and programmable input/output peripherals.
Embedded Processor: An embedded processor is a type of microprocessor designed into a system to control electrical and mechanical functions. An embedded processor can be programmed specifically for the work it is intended to do.
Digital Signal Processor: A digital signal processor is an IC or IP core designed to process signals digitally and efficiently.
Media Processor: A media processor, mostly used as an image/video processor, is a microprocessor-based system-on-a-chip that is designed to deal with digital streaming data.

What is DMA? Direct Memory Access (DMA) is a technique, implemented by a hardware unit called the DMA controller, that allows I/O devices to access memory directly with minimal participation of the processor. The DMA controller needs the usual interface circuits to communicate with the CPU and the input/output devices. The DMA controller has three registers as follows. Address register: contains the address that specifies the desired location in memory. Word count register: contains the number of words to be transferred. Control register: specifies the transfer mode.

Block Diagram of DMA All registers in the DMA appear to the CPU as I/O interface registers. Therefore, the CPU can both read and write into the DMA registers under program control via the data bus.

Explanation of DMA The CPU initializes the DMA controller by sending the following information through the data bus: the starting address of the memory block where the data are available (for a read) or where the data are to be stored (for a write); the word count, which is the number of words in the memory block to be read or written; a control to define the mode of transfer, such as read or write; and a control to begin the DMA transfer.
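The initialization sequence can be sketched with a small model of the three registers. The class `DMAController`, its `run` method, and the use of Python lists for memory and for the device's data are assumptions made for illustration. Following the text, "read" moves words from memory to the device and "write" moves words from the device into memory.

```python
class DMAController:
    def __init__(self):
        self.address = 0        # address register
        self.word_count = 0     # word count register
        self.control = "read"   # control register: "read" or "write"

    def run(self, memory, device):
        """Transfer word_count words without any CPU involvement per word."""
        while self.word_count > 0:
            if self.control == "write":
                memory[self.address] = device.pop(0)   # device -> memory
            else:
                device.append(memory[self.address])    # memory -> device
            self.address += 1
            self.word_count -= 1


memory = [0] * 8
dma = DMAController()

# The CPU initializes the controller, then starts the transfer.
dma.address, dma.word_count, dma.control = 2, 3, "write"
dma.run(memory, [7, 8, 9])      # memory[2:5] now holds 7, 8, 9
```

After the three register writes, the CPU is free to do other work while the controller steps through the block on its own.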

Summary of Unit 1: Types of Computer; Functional Units of Computer; Computer Memory; Basic Operational Concepts; Bus Structure and Protocols; Bus Arbitration; CPU Performance; Parallelism; Accessing I/O Devices; Memory Mapped I/O Vs Program-Controlled I/O; I/O Device Interface; Parallel Interface Vs. Serial Interface; Interrupts; DMA
