[GeeCON2024] How I learned to stop worrying and love the dark silicon apocalypse
tkowalcz
50 slides
May 15, 2024
About This Presentation
Computation is increasingly constrained by power. With each advancement in the manufacturing process, a decreasing percentage of the CPU can operate at full capacity, leading to the emergence of the term 'dark silicon'. This trend necessitates techniques that utilize chip area to optimize power efficiency through specialized accelerators.
The presentation will outline the key concepts behind dark silicon, such as Moore’s law and the breakdown of Dennard scaling, followed by an overview of current and upcoming CPU accelerators. The focus will then shift to vector units and the specifics of vector programming. Attendees will be introduced to registers, a range of vector operations, and methods for developing branchless algorithms such as sorting networks. The session will conclude with an overview of the new Java Vector API and how projects have already adopted it for AI inference (Llama 2) and vector search (AstraDB and Cassandra).
Slide Content
HOW I LEARNED TO STOP
WORRYING AND LOVE THE DARK
SILICON APOCALYPSE
HTTP://SLI.DO
#GEECON
ROOM 11
Dennard scaling
As transistors get smaller, their power density stays
constant, so that the power use stays in proportion
with area; both voltage and current scale
(downward) with length
ROBERT H. DENNARD, IBM
[Figure: CPU die scaled by factor S. Labels: x2 (2 EPOCHS), x4 (4 EPOCHS)]
75% DARK AFTER 2 GENERATIONS
93% DARK AFTER 4 GENERATIONS
MICHAEL B. TAYLOR, IS DARK SILICON USEFUL?
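The two percentages can be reproduced with a little arithmetic. The sketch below assumes the usual post-Dennard reading of the utilization wall: for a linear scaling factor S the transistor count grows as S^2 while the power budget stays fixed, so only 1/S^2 of the chip can run at full frequency. It also assumes that S = 2 corresponds to the slide's "2 generations" and S = 4 to "4 generations". The class and method names are illustrative only.

```java
public class DarkSilicon {
    // Fraction of the chip that must stay dark at full frequency,
    // ASSUMING utilization falls as 1/S^2 for linear scaling factor S.
    public static double darkFraction(double s) {
        return 1.0 - 1.0 / (s * s);
    }

    public static void main(String[] args) {
        // S = 2 and S = 4 are taken from the x2 / x4 labels on the slide.
        System.out.println("S=2 dark fraction: " + darkFraction(2));
        System.out.println("S=4 dark fraction: " + darkFraction(4));
    }
}
```

Under these assumptions the dark fraction is 0.75 for S = 2 and 0.9375 for S = 4, which the slide rounds down to 93%.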
MORE CORES, MORE BETTER
MORE ACCELERATORS, MORE BETTER
(C) PATRICK KENNEDY, SERVETHEHOME.COM
FJCVTZS
Floating-point Javascript Convert to Signed fixed-point, rounding toward Zero.
PCMPESTRI
Packed Compare Explicit Length Strings, Return Index
PICTURE CREDIT TO @FRITZCHENSFRITZ, ANNOTATED BY @GPUSAREMAGIC
Integer Integer Integer Integer Integer Integer Integer Integer
VECTOR REGISTERS
SHAPE + ELEMENT TYPE = SPECIES
SHAPE
64 bit
128 bit
256 bit
Long Long Long Long
Double Double Double Double
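The relation between shape, element type and species can be checked directly in the Java Vector API. This is a minimal sketch, assuming a JDK that ships the incubating module (compile and run with `--add-modules jdk.incubator.vector`). The class name `SpeciesDemo` and the helper `lanes` are made up for illustration.

```java
import jdk.incubator.vector.VectorShape;
import jdk.incubator.vector.VectorSpecies;

public class SpeciesDemo {
    // A species pairs an element type with a shape (register width);
    // its length is the number of lanes that fit in one vector.
    public static int lanes(Class<?> elementType, VectorShape shape) {
        return VectorSpecies.of(elementType, shape).length();
    }

    public static void main(String[] args) {
        // 256-bit shape: 8 int lanes, 4 long lanes, 4 double lanes.
        System.out.println(lanes(int.class, VectorShape.S_256_BIT));
        System.out.println(lanes(long.class, VectorShape.S_256_BIT));
        System.out.println(lanes(double.class, VectorShape.S_256_BIT));
        // The same element type in a narrower shape has fewer lanes.
        System.out.println(lanes(int.class, VectorShape.S_128_BIT));
    }
}
```

The lane count depends only on the species, not on the machine. Whether a given shape maps to a single hardware register is a separate question that depends on the CPU.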
Vector cosine similarity
text-embedding-3-large is our new next generation
larger embedding model and creates embeddings
with up to 3072 dimensions.
PICTURE CREDIT OPENAI
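Cosine similarity over embeddings of this size is a natural fit for vector units: three running sums (dot product and two squared norms) are accumulated lane by lane and reduced once at the end. The following is a minimal sketch using the incubating Java Vector API (run with `--add-modules jdk.incubator.vector`). It is not taken from the slides or from any of the projects mentioned, and the class and method names are illustrative.

```java
import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

public class CosineSimilarity {
    // Widest float species the current CPU supports efficiently.
    private static final VectorSpecies<Float> SPECIES = FloatVector.SPECIES_PREFERRED;

    public static float cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        FloatVector dot = FloatVector.zero(SPECIES);
        FloatVector normA = FloatVector.zero(SPECIES);
        FloatVector normB = FloatVector.zero(SPECIES);

        int i = 0;
        int upper = SPECIES.loopBound(a.length); // largest multiple of the lane count
        for (; i < upper; i += SPECIES.length()) {
            FloatVector va = FloatVector.fromArray(SPECIES, a, i);
            FloatVector vb = FloatVector.fromArray(SPECIES, b, i);
            dot = va.fma(vb, dot);     // dot   += a * b, per lane
            normA = va.fma(va, normA); // normA += a * a, per lane
            normB = vb.fma(vb, normB); // normB += b * b, per lane
        }

        // Collapse the per-lane partial sums into scalars.
        float d = dot.reduceLanes(VectorOperators.ADD);
        float na = normA.reduceLanes(VectorOperators.ADD);
        float nb = normB.reduceLanes(VectorOperators.ADD);

        // Scalar tail for lengths that are not a multiple of the lane count.
        for (; i < a.length; i++) {
            d += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return (float) (d / Math.sqrt((double) na * nb));
    }
}
```

A 3072-dimensional embedding divides evenly by common lane counts (4, 8 or 16 floats), so the scalar tail only matters for other lengths. A zero vector yields NaN here; callers that may pass one need their own guard.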
Sorting networks
BERENGER BRAMAS,
A NOVEL HYBRID QUICKSORT ALGORITHM VECTORIZED USING AVX-512 ON INTEL SKYLAKE
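To make the idea concrete, here is a small scalar sketch of a sorting network for four values. It is not the AVX-512 algorithm from the cited paper; it only illustrates the building block, a fixed sequence of compare-exchange steps with no data-dependent control flow. The comparator order is the standard 5-comparator network for 4 inputs; the class and method names are illustrative.

```java
public class SortingNetwork4 {
    // Compare-exchange: afterwards v[i] <= v[j]. Written with min/max
    // instead of an if/swap so the source has no data-dependent branch.
    private static void compareExchange(int[] v, int i, int j) {
        int lo = Math.min(v[i], v[j]);
        int hi = Math.max(v[i], v[j]);
        v[i] = lo;
        v[j] = hi;
    }

    // Sorts exactly four elements in place with a fixed comparator sequence.
    public static void sort4(int[] v) {
        if (v.length != 4) {
            throw new IllegalArgumentException("expected exactly 4 elements");
        }
        compareExchange(v, 0, 1);
        compareExchange(v, 2, 3);
        compareExchange(v, 0, 2);
        compareExchange(v, 1, 3);
        compareExchange(v, 1, 2);
    }
}
```

Because the sequence of comparisons never depends on the data, the same steps can be applied to whole vector registers at once, with lane-wise min and max taking the place of the scalar calls.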
NEED FOR UPFRONT DESIGN OF DATA STRUCTURES AND MEMORY LAYOUTS
COMPLICATED ALGORITHMS
NEED TO EMPLOY ELABORATE CODE TACTICS TO NOT BREAK (RE)BOXING
FAILURE TO OPTIMISE LEADS TO CATASTROPHIC DEGRADATION OF PERFORMANCE
WILL INCUBATE UNTIL PROJECT VALHALLA BECOMES AVAILABLE
NEED TO MEASURE, MEASURE, MEASURE ON TARGET HARDWARE
FAST AND ROBUST VECTORIZED IN-PLACE SORTING OF PRIMITIVE TYPES
INTEL® INTRINSICS GUIDE
JVECTOR SIMDOPS
PERFORMANCE SPEED LIMITS
DESIGNING IN 2023 - 10 PROBLEMS TO SOLVE (JIM KELLER)
JLLAMA