Distributed (Radix) Hash Join presentation

JasonHu80 · 8 slides · Jun 18, 2024

About This Presentation

distributed radix hash join


Slide Content

Distributed (Radix) Hash Join

Distributed Join
- Partition the data among the nodes (e.g. 4 nodes). Two options:
  - Every node keeps the full S and only some part of R.
  - Every node keeps some part of S and some part of R, splitting the data by hashing the join key.
- Afterwards each node can perform a purely local join: it already holds all candidate tuples.
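The hash-based split above can be sketched as follows. This is a minimal illustration, not code from the presentation; the node count and the multiplicative hash are our assumptions.

```cpp
#include <cstdint>

// Assumed cluster size for the sketch (the slides use 4 nodes as an example).
constexpr uint32_t NUM_NODES = 4;

// Route a tuple to a node by hashing its join key. Because both R and S are
// hashed the same way, matching tuples are co-located on one node, and that
// node can later run a purely local join.
inline uint32_t node_for_key(uint32_t key) {
    // Multiplicative (Fibonacci) hashing spreads skewed keys before the modulo.
    return (key * 2654435769u) % NUM_NODES;
}
```

Any hash works as long as both tables use the same one; the multiplier here is just one common choice.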

Radix (Hash) Join
- Core idea: partition the input tuples further, so that the hash table for each partition fits inside the CPU cache, which also causes fewer TLB misses.
- Uses radix partitioning: 2+ passes per table (1st pass builds a histogram, 2nd pass actually scatters the tuples into their partitions; repeat per partitioning level).
- Done for each input table.

Distributed Radix Hash Join: Partitioning Phase
- Network partitioning pass: data needs to be transmitted over the network, in parallel with the computation.
- Local partitioning passes: partition further, optimizing for cache/TLB locality.

Interleave sending with compute
- Still, the first pass is needed (to get the histogram); the first pass could itself be distributed, with a scatter-gather to the master?
- Second pass: read the base data, compute each tuple's partition, and write/send it out to its partition (interleave reading and sending).
- Easier to do with RDMA; still possible with TCP sockets?
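The interleaved second pass might look like the sketch below (all names are ours, and the `send` callback stands in for the actual transport): the base table is processed in chunks, and each chunk's per-destination buffers are handed off as soon as they are filled, so transmission overlaps with partitioning. With RDMA the send would be a one-sided write into a remote region; over TCP sockets it would be a socket write per buffer.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Partition `keys` across `num_nodes` destinations, `chunk` tuples at a time,
// invoking `send(node, buffer)` per filled buffer: "compute some, send some".
void partition_and_send(
        const std::vector<uint32_t>& keys, size_t chunk, uint32_t num_nodes,
        const std::function<void(uint32_t, const std::vector<uint32_t>&)>& send) {
    for (size_t base = 0; base < keys.size(); base += chunk) {
        std::vector<std::vector<uint32_t>> bufs(num_nodes);
        size_t end = std::min(base + chunk, keys.size());
        for (size_t i = base; i < end; ++i)
            bufs[keys[i] % num_nodes].push_back(keys[i]);  // compute some...
        for (uint32_t n = 0; n < num_nodes; ++n)
            if (!bufs[n].empty()) send(n, bufs[n]);        // ...send some
    }
}
```

A real implementation would reuse the buffers and make `send` asynchronous so the next chunk's partitioning overlaps with the previous chunk's transmission.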

What I have been doing (naive)
- Hash-partition the data among the nodes on the master; shuffle both tables' tuples to the worker nodes.
- Integrate with a reference radix hash join algorithm to do the rest.

What I should do (I think)
- First, send the base tables' tuples to the other nodes (range partition instead of hash?).
- Distribute the radix partitioning itself.
- Scatter-gather, then broadcast the histogram.
- Write out the partitions, interleaving: compute some, send some, repeat.
- With a global histogram/global write pointers, no synchronization is needed.
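The last point can be sketched as follows (the layout and names are assumptions, not the presentation's code): once per-node histograms are gathered, the master can compute, for every (node, partition) pair, a starting write offset into the final contiguous layout. Each node then writes its own tuples at its own offsets without coordinating with anyone else.

```cpp
#include <cstddef>
#include <vector>

// Given hist[node][partition] = tuple count, return off[node][partition] =
// starting write offset of that node's tuples within the global layout
// (partitions contiguous, nodes ordered within each partition). Disjoint
// offset ranges mean no synchronization is needed during the writes.
std::vector<std::vector<size_t>> global_offsets(
        const std::vector<std::vector<size_t>>& hist) {
    size_t nodes = hist.size(), parts = hist[0].size();
    std::vector<std::vector<size_t>> off(nodes, std::vector<size_t>(parts, 0));
    size_t running = 0;
    for (size_t p = 0; p < parts; ++p)
        for (size_t n = 0; n < nodes; ++n) {
            off[n][p] = running;
            running += hist[n][p];
        }
    return off;
}
```

This is the same exclusive prefix sum as in the local radix pass, just taken over all nodes' histograms at once, which is why a scatter-gather plus broadcast of the histogram is enough.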