Achieving 100K Queries per Hour with Hive on Tez
  August 25, 2016
About Yahoo! JAPAN
  The largest portal site in Japan
    65 billion pageviews / month
    2.1 billion pageviews / day
YDN Report
  What is YDN Report?
    Reports for the Yahoo Display Ads Network
  Batch reporting over a massive dataset
    13 months, 800B+ rows of data
    3.3B+ rows added per day
  Highly parallel workload
    100K reports per hour
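For a rough sense of scale, the hourly target can be converted to a per-second rate. This is a minimal Python sketch; the function name is illustrative and not from the talk.

```python
def per_second(per_hour: float) -> float:
    """Convert an hourly rate into a per-second rate."""
    return per_hour / 3600.0


reports_per_hour = 100_000  # target stated on the slide
rate = per_second(reports_per_hour)
print(f"{rate:.1f} reports per second")
```

Sustaining the target therefore means completing a few dozen reports every second, which is why per-query startup costs matter so much later in the talk.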
YDN Report Query
  Typical query
    The query is relatively simple
    It answers questions like "How many clicks did I get last week?"
  SELECT account, yyyymmdd, sum(total_imps), sum(total_click), ...
  FROM table_x
  WHERE yyyymmdd >= xxx AND yyyymmdd < xxx AND account = xxx ...
  GROUP BY account, yyyymmdd, ...;
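The shape of this query can be tried out locally. The sketch below uses Python's built-in sqlite3 module as a stand-in for Hive, so it only illustrates the filter-and-aggregate pattern, not Hive behavior. The table and column names follow the slide; the account names, dates, and row values are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for Hive here; table and column names follow the slide.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_x ("
    "account TEXT, yyyymmdd TEXT, total_imps INTEGER, total_click INTEGER)"
)
rows = [
    ("acct_a", "20160801", 1000, 10),
    ("acct_a", "20160801", 500, 5),
    ("acct_a", "20160802", 800, 8),
    ("acct_b", "20160801", 300, 3),  # different account, filtered out
    ("acct_a", "20160808", 900, 9),  # outside the date range, filtered out
]
conn.executemany("INSERT INTO table_x VALUES (?, ?, ?, ?)", rows)

query = """
SELECT account, yyyymmdd, SUM(total_imps), SUM(total_click)
FROM table_x
WHERE yyyymmdd >= ? AND yyyymmdd < ? AND account = ?
GROUP BY account, yyyymmdd
ORDER BY yyyymmdd
"""
result = conn.execute(query, ("20160801", "20160808", "acct_a")).fetchall()
for row in result:
    print(row)
```

Each report touches one account and a bounded date range, so the work per query is small; the challenge is running very many of them in parallel.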
Test Setup
Hive Performance Recap
  Hive is fast: interactive response
    ORC columnar file format
    Cost-based optimizer (CBO)
    Vectorized SQL engine
    Tez execution engine (replacing MapReduce)
  Hive 0.10 (batch processing) -> Hive 1.2 (human interactive, 5 seconds): 100-150x query speedup
Hive on Tez Query Execution
  A query execution is essentially composed of:
    Client execution [0s if done correctly]
    Optimization [HiveServer2] [~0.1s]
    Metadata lookups [HCatalog, Metastore] [very fast in Hive 0.14]
    Application Master creation [4-5s]
    Container allocation [3-5s]
    Tez task execution on YARN
  [Diagram: client running the testing tool, with N connections each to HiveServer2 Server #1 and Server #2; Metastore and Metastore DB; Tez AM and Tez containers on YARN and HDFS]
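Adding up the fixed costs listed above shows how much time could pass before any Tez task runs. This is a minimal Python sketch that assumes the stages run one after another and that every query pays each of them in full; neither assumption is stated in the talk.

```python
# Per-query fixed costs in seconds as (low, high), taken from the slide.
# Metadata lookups are listed only as "very fast", so they are left out.
stage_seconds = {
    "client execution": (0.0, 0.0),
    "optimization": (0.1, 0.1),
    "application master creation": (4.0, 5.0),
    "container allocation": (3.0, 5.0),
}

low = sum(lo for lo, _ in stage_seconds.values())
high = sum(hi for _, hi in stage_seconds.values())
print(f"fixed overhead before any Tez task runs: {low:.1f}s to {high:.1f}s")
```

Under these assumptions, Application Master creation and container allocation dominate the total, which is consistent with the later focus on YARN-side tuning.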
Mini Test
  Mini setup
    Tested on 50 nodes
    450B rows dataset
    Achieved 15K queries per hour
  So, can we get 100K qph on 700 nodes?
    We thought it should be easy, but ...
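The expectation that 100K qph "should be easy" can be reproduced with a naive linear extrapolation from the mini test. This is a minimal Python sketch; it assumes throughput scales linearly with node count and ignores the larger dataset, which is exactly the assumption the rest of the talk shows to be too optimistic.

```python
def extrapolate_linear(qph: float, nodes: int, target_nodes: int) -> float:
    """Scale measured throughput to a larger cluster, assuming perfect linear scaling."""
    return qph / nodes * target_nodes


mini_qph = 15_000   # measured in the mini test
mini_nodes = 50
full_nodes = 700

projected = extrapolate_linear(mini_qph, mini_nodes, full_nodes)
print(f"{projected:,.0f} qph projected on {full_nodes} nodes")
print(f"headroom over the 100K target: {projected / 100_000:.1f}x")
```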
The Bottlenecks at Scale
  Challenges at scale
    Hive Metastore Server
    YARN ResourceManager
    DataNode hotspot
    YARN ATS
Hive Metastore Server
  Use a local metastore
    Before: HS2 -> Metastore Server -> Metastore DB
    After: HS2 (local metastore) -> Metastore DB
  Throughput: 7.6K -> 22K qph
YARN ResourceManager Scalability
  Pending apps: too many pending apps
  Resolution: increase yarn.resourcemanager.amlauncher.thread-count
  Throughput: 22K -> 26K qph
YARN ResourceManager Scalability
  Pending containers: too many pending containers
  Resolution: increase tez.am.am-rm.heartbeat.interval-ms.max
  Throughput: 26K -> 72.5K qph
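One plausible reading of this fix is that a longer AM-RM heartbeat interval lowers the request rate the ResourceManager has to absorb, leaving it more capacity to hand out containers. The Python sketch below only illustrates that relationship. The number of concurrent Application Masters and the interval values are hypothetical, and the model of exactly one heartbeat per AM per interval is a simplification.

```python
def heartbeats_per_second(concurrent_ams: int, interval_ms: int) -> float:
    """Requests per second reaching the ResourceManager if every AM
    sends exactly one heartbeat per interval."""
    return concurrent_ams * 1000.0 / interval_ms


# Hypothetical numbers, chosen only to show the shape of the trade-off.
concurrent_ams = 500
for interval_ms in (250, 1000, 2000):
    load = heartbeats_per_second(concurrent_ams, interval_ms)
    print(f"interval {interval_ms} ms -> {load:.0f} heartbeats/s")
```

The cost of a longer interval is that each AM learns about newly allocated containers later, so the setting trades per-query latency for cluster-wide throughput.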
DataNode Hotspot
  The "last hour" problem
    Many queries access recently added data
    Connection timeouts and disk access errors
  Resolution: increase the HDFS replication factor
  Throughput: 72.5K -> 95K qph
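The throughput figures reported after each fix can be lined up to see which change contributed most. This is a minimal Python sketch using only the numbers from the slides; the step labels are paraphrases of the fixes.

```python
# Throughput after each fix, in queries per hour, as reported on the slides.
steps = [
    ("baseline", 7_600),
    ("local metastore", 22_000),
    ("more AM launcher threads", 26_000),
    ("longer AM-RM heartbeat interval", 72_500),
    ("higher HDFS replication factor", 95_000),
]

gains = []
for (_, before), (name, after) in zip(steps, steps[1:]):
    gain = after / before
    gains.append((name, gain))
    print(f"{name}: {before:,} -> {after:,} qph ({gain:.2f}x)")

total = steps[-1][1] / steps[0][1]
print(f"overall: {total:.1f}x")
```

The local metastore and the heartbeat interval each give close to a 3x step, while the other two fixes give smaller but still necessary gains.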
Other Tunings
  Other tunings we did
    Container reuse timeout
    YARN capacity scheduler node locality delay
    Tez shuffle keep-alive
    TCP fin_wait
  Notes on YARN ATS
    Disabling YARN ATS gives higher throughput
    Trade-off: losing YARN log aggregation
End of first half
Real-life Hive LLAP at Yahoo! JAPAN
  Yohei Abe @ Yahoo! JAPAN
  Aug 2016
Agenda
  Hive LLAP at Yahoo! JAPAN
  Tuning
  Performance Result
  Future Work
Hive LLAP at Yahoo! JAPAN
Hive on Tez
  Hive on Tez is able to produce 100K reports/hour
Hive on Tez+LLAP
  How does Hive on Tez+LLAP handle 100K reports?
    How many servers?
    What tuning?
What is LLAP
What is LLAP?
  LLAP is for sub-second query processing
    Persistent daemons
    Caching data
What is LLAP?
  [Diagram: with Tez, the Tez AppMaster uses Tez containers that are created dynamically; with Tez+LLAP, the Tez AppMaster uses persistent LLAP daemons]
Basic Tuning
LLAP test cluster
  Server node: Xeon E5-2660v2 2.2GHz / 2 CPUs / 128 GB memory / 10GBase-T, 2 ports
  Slave nodes: 45
  HiveServer2 nodes: 10
  Hadoop 2.7.1
  Hive 2.1.0-snapshot
  Tez 0.8.3
Parameters
  Some basic parameters need to be changed
    Performance is very slow with the default values
Threading model
  [Diagram: executor threads (hive.llap.daemon.num.executors), I/O threads (hive.llap.io.threadpool.size), and data]
Executor thread pool
  hive.llap.daemon.num.executors (default 4)
    The number of JVM threads for query execution
    Set this to the number of vCPUs
    40 in our case
Performance: executor thread
I/O thread pool
  hive.llap.io.threadpool.size (default 10)
    The number of I/O threads
    Set this to the number of vCPUs
    40 in our case
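The value 40 matches the logical CPU count of the test nodes. The Python sketch below derives it and applies it to both pools. The core and thread counts are assumptions based on the published Xeon E5-2660 v2 specification (10 cores per socket, 2 hardware threads per core with Hyper-Threading enabled); the talk itself only states "2CPU" and "40".

```python
# Assumed topology for a 2-socket Xeon E5-2660 v2 node: 10 cores per socket,
# 2 hardware threads per core (Hyper-Threading enabled).
sockets = 2
cores_per_socket = 10
threads_per_core = 2

vcpus = sockets * cores_per_socket * threads_per_core

# Both pools are sized to the vCPU count, as the slides recommend.
llap_settings = {
    "hive.llap.daemon.num.executors": vcpus,
    "hive.llap.io.threadpool.size": vcpus,
}
for key, value in llap_settings.items():
    print(f"{key}={value}")
```

On a live node, Python's os.cpu_count() reports the logical CPU count and could replace the hard-coded topology.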
Advanced Tuning
hive.llap.client.consistent.splits
  false (default) => the LLAP daemon is selected by file locality
  true => LLAP daemons are selected evenly (by hash distribution)
Recap: LLAP
  Each node runs an LLAP daemon and also a DataNode
hive.llap.client.consistent.splits
  [Chart: Locality vs. No Locality]
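The idea of spreading splits evenly by hash can be sketched as follows. This is not Hive's split-placement code: it is a plain hash-modulo assignment in Python, and the function name, daemon names, file paths, and split size are all hypothetical. Only the daemon count (45, one per slave node) comes from the test cluster description.

```python
import hashlib
from collections import Counter


def pick_daemon(path: str, offset: int, daemons: list[str]) -> str:
    """Map a split to a daemon using a stable hash of its path and offset."""
    key = f"{path}:{offset}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(daemons)
    return daemons[index]


daemons = [f"llap-{i}" for i in range(45)]  # one daemon per slave node

# Hypothetical splits: 200 files with two splits each.
splits = [
    (f"/warehouse/table_x/part-{p:05d}.orc", off)
    for p in range(200)
    for off in (0, 256 * 1024 * 1024)
]

load = Counter(pick_daemon(path, off, daemons) for path, off in splits)
print("daemons used:", len(load))
print("max splits on one daemon:", max(load.values()))
```

Because the same split always maps to the same daemon, repeated queries over the same data can hit that daemon's cache, and hot files no longer concentrate work on the few nodes that hold their HDFS replicas.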
Future Work
Web UI
Web UI (HIVE-11526)
  The LLAP daemon exposes basic metrics on port 15002 (default)
  Included in Hive 2.1
  Contributed by Yahoo! JAPAN
Web UI (HIVE-14030)
  HIVE-11526 covers each daemon individually
  HIVE-14030 provides an aggregated view of an LLAP cluster (not yet in master)
  Contributed by Yahoo! JAPAN
ACL
Hive Column-level ACL
  Goal: column-level ACLs
  Answer (?): HiveServer2 can do it
  [Diagram: SQL, HS2, LLAP, YARN, HDFS]
Direct Access to HDFS breaks everything
  But direct access to HDFS (not through Hive) breaks the security model
  Other solutions (not only Hive) are necessary
  [Diagram: HS2, LLAP, YARN, HDFS; storage-based authorization; M/R, Pig, Spark break SQL standard-based ACLs]
Future Directions
  LlapInputFormat checks ACLs with HS2 on behalf of other applications
    HIVE-13441, HIVE-12991
    See LlapDump.java
  [Diagram: HS2, LLAP, YARN, HDFS; LlapInputFormat; M/R, Pig, Spark; check SQL-based ACLs]
Summary
Summary
  Throughput is greatly improved by LLAP
  Some tuning is necessary
  LLAP is also effective for batch processing