Resources
- Computer Architecture: A Quantitative Approach, by John L. Hennessy and David A. Patterson
Other references
- Computer Organization and Design, by David A. Patterson and John L. Hennessy
Other Resources
- Slides and Notes: https://drive.google.com/drive/folders/1sFr9JbWukcJXtMvJlQDpnHS30S_uCpkr?usp=sharing
- Google Classroom Link: https://classroom.google.com/c/MzkwMTQzNTE0NTEy?cjc=vhsxwxv
Course Contents
- Fundamentals of Computer Architecture
- Reviewing Some Basic Concepts
- Instruction-Level Parallelism and Its Exploitation
- Limits on Instruction-Level Parallelism
- Multiprocessors and Thread-Level Parallelism
- Data-Level Parallelism
- Request-Level Parallelism
- Memory Hierarchy Design
- Storage Systems
- Others
Classes of Computers
- Personal Mobile Device (PMD)
  - e.g. smartphones, tablet computers
  - Emphasis on energy efficiency
- Desktop Computing
  - Emphasis on price-performance
- Servers
  - Emphasis on availability (downtime is very costly), scalability, throughput
- Clusters / Warehouse-Scale Computers
  - Used for “Software as a Service (SaaS)”, PaaS, IaaS, etc.
  - Emphasis on availability and price-performance
  - Sub-class: supercomputers; emphasis on floating-point performance, fast internal networks, and big data analytics
- Embedded Computers
  - Emphasis on price
What is Computer Architecture?
- “Old” view of computer architecture: Instruction Set Architecture (ISA) design
  - i.e. decisions regarding registers, memory addressing, addressing modes, instruction operands, available operations, control flow instructions, instruction encoding
- “Real” computer architecture:
  - Specific requirements of the target machine
  - Design to maximize performance within constraints: cost, power, and availability
  - Includes ISA, microarchitecture, hardware
Instruction Set Architecture
Abstraction
- An abstraction omits “unneeded” detail and helps us cope with complexity
The Instruction Set: a Critical Interface
[Figure: the instruction set shown as the interface between software and hardware]
Instruction Set Architecture
- A very important abstraction:
  - interface between hardware and low-level software
  - standardizes instructions, machine language bit patterns, etc.
  - allows different implementations of the same architecture
- Modern instruction set architectures: 80x86/Pentium/K6, PowerPC, DEC Alpha, MIPS, SPARC, HP
Instruction Set Architecture
- Operations
- Memory Addressing (byte addressable, aligned)
- Types and Sizes of Operands
- Addressing Modes
- Control Flow Instructions
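As a small illustration of the "byte addressable, aligned" point above: an access to an object of s bytes at byte address A is conventionally called aligned when A mod s = 0. The sketch below only encodes that definition; the function name `is_aligned` is made up for this example and does not come from the slides.

```python
def is_aligned(address: int, size: int) -> bool:
    """Return True if an access of `size` bytes at byte `address` is aligned.

    Hypothetical helper: uses the usual definition, address mod size == 0.
    """
    if size <= 0:
        raise ValueError("size must be a positive number of bytes")
    return address % size == 0


# A 4-byte word at address 8 is aligned; the same word at address 6 is not.
print(is_aligned(8, 4))   # True
print(is_aligned(6, 4))   # False
# A single byte is aligned at any address.
print(is_aligned(7, 1))   # True
```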
Two categories of ISAs: RISC vs. CISC
- Reduced Instruction Set Computing (RISC)
  - Software-centric approach
  - Instructions are simple (fixed size)
  - A reduced number of instructions
  - Instructions on average take very few cycles (single cycle)
- Complex Instruction Set Computing (CISC)
  - Hardware-centric approach
  - Instructions are complex (different sizes)
  - A large number of instructions
  - Instructions on average take a large number of cycles
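The RISC vs. CISC trade-off can be reasoned about with the standard relation CPU time = instruction count × cycles per instruction (CPI) / clock rate. The sketch below is only an illustration: the instruction counts, CPI values, and clock rate are invented numbers, not measurements of any real machine, and they say nothing about which style wins in practice.

```python
def cpu_time(instruction_count: int, cpi: float, clock_rate_hz: float) -> float:
    """CPU time in seconds = instruction count * CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


CLOCK_HZ = 1_000_000_000  # assumed 1 GHz clock for both machines

# Assumed RISC-style program: more instructions, about one cycle each.
risc_seconds = cpu_time(1_500_000_000, cpi=1, clock_rate_hz=CLOCK_HZ)

# Assumed CISC-style program: fewer instructions, more cycles each.
cisc_seconds = cpu_time(1_000_000_000, cpi=4, clock_rate_hz=CLOCK_HZ)

print(risc_seconds)  # 1.5
print(cisc_seconds)  # 4.0
```

With these made-up inputs, the lower CPI outweighs the higher instruction count; different assumptions would change the outcome.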
Trends in Chip Organization and Architecture
Past: The Single Processor Chip
- Performance scaling accomplished largely through increases in clock frequency
- Problems
  - Heat dissipation
  - Power consumption
  - High cost for little gain in performance
- In 2004, Intel cancelled two single-core processors because of these problems. Other vendors had already moved on to multicore designs by then.
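One way to see why clock-frequency scaling ran into power and heat limits is the usual first-order model for CMOS dynamic power, P ≈ 1/2 × capacitive load × voltage² × switching frequency. This model is not stated on the slide; it is brought in here as an assumption, and the capacitance, voltage, and frequency values below are placeholders chosen only to show the scaling.

```python
def dynamic_power(capacitive_load_f: float, voltage_v: float, frequency_hz: float) -> float:
    """First-order dynamic power in watts: 0.5 * C * V^2 * f."""
    return 0.5 * capacitive_load_f * voltage_v ** 2 * frequency_hz


# Placeholder baseline values (assumptions, not data for a real chip).
base = dynamic_power(1e-9, 1.0, 2e9)

# Doubling frequency at the same voltage doubles dynamic power.
faster = dynamic_power(1e-9, 1.0, 4e9)

# If the higher frequency also needs a higher voltage, power grows with V^2 as well.
faster_higher_v = dynamic_power(1e-9, 1.2, 4e9)

print(faster / base)           # ratio close to 2
print(faster_higher_v / base)  # ratio close to 2 * 1.2**2
```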
Single Processor Performance
[Figure: single-processor performance over time, annotated with "RISC" and "Move to multi-processor"]
Current Trends
- Cannot continue to leverage instruction-level parallelism (ILP)
  - Single processor performance improvement ended in 2003
- New models for performance:
  - Data-level parallelism (DLP)
  - Thread-level parallelism (TLP)
  - Request-level parallelism (RLP)
- These require explicit restructuring of the application
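To make "explicit restructuring of the application" concrete, here is a minimal sketch that rewrites a sequential loop so that independent chunks of data are handed to a pool of worker threads. All function names are invented for this example. It uses Python's standard `concurrent.futures.ThreadPoolExecutor`; note that in standard CPython, threads do not speed up CPU-bound work like this, so the example shows only the change in program structure, not a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_squares_sequential(values):
    """Original form: one loop, one thread of control."""
    total = 0
    for v in values:
        total += v * v
    return total


def sum_squares_chunk(chunk):
    """Work on one independent slice of the data."""
    return sum(v * v for v in chunk)


def sum_squares_parallel(values, workers=4):
    """Restructured form: split the data, process chunks in workers, combine."""
    size = max(1, (len(values) + workers - 1) // workers)  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = pool.map(sum_squares_chunk, chunks)
        return sum(partial_sums)


data = list(range(10))
print(sum_squares_sequential(data))  # 285
print(sum_squares_parallel(data))    # 285
```

The programmer, not the hardware, has to identify the independent chunks and the final combining step; that is the restructuring the slide refers to.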
Present: Multicores
- Number of cores per chip: 2-10s
- Heterogeneous processors: specialized processors
- Features relevant for programming are:
  - Shared and coherent caches
  - Single address space
  - Non-uniform memory access (NUMA): memory can be distributed for more scalability
[Figure: multicore block diagram showing CORE 0 and CORE 1, L1 caches, L2 cache, and memory]
Future: Manycores??
- Massively parallel architectures
- Number of cores: 1000s and more
- Memory
  - Multiple address spaces
  - Non-uniform access timings
  - On-chip addressable memory
  - Islands of cache coherence