Serverless Big Data Architecture on Google Cloud Platform at Credit OK

spicydog 1,448 views 31 slides Dec 02, 2018

About This Presentation

This is a talk given at Barcamp Bangkhen 2018,
presented by Kriangkrai Chaonithi.
I shared my experience at Credit OK building a data pipeline that ingests huge amounts of customer data into our big data analytics warehouse using serverless services on Google Cloud Platform.
As a result, we can make it withou...

Slide Content

Serverless Big Data Architecture on Google Cloud Platform at Credit OK

Presented by Kriangkrai Chaonithi @spicydog
On 25/11/2018, At Barcamp Bangkhen 9

Hello! My name is Gap

Education
●BS Applied Computer Science (KMUTT)
●MS Applied Computer Engineering (KMUTT)
Work Experience
●Former Android, iOS & PHP Developer at Longdo.COM
●Former R&D Manager at Insightera
●CTO & co-founder at Credit OK
Fields of Interest
●Software Engineering
●Computer Security
●Servers & Cloud & Distributed Computing
●Machine Learning & NLP
https://spicydog.me

Agenda

●Server & application deployment history
●Introduction to Google Cloud Platform products
○Computing
○Storage & databases
○Data analytics
●Big data architecture at Credit OK
○About Credit OK
○Why we use serverless
○Our requirements
○Our solutions
○The summary

Server & Application
Deployment History

Bare Metal Server

●Pre-cloud era (probably..)
●Install OS and dependencies on a machine
●One machine - one server
●Expose the network to the internet
●Colocation/on-premise
●SSH/FTP/Git to the server

Virtualization

●One machine - many servers
●One machine - multiple customers
●VPS / Cloud
●SSH/FTP/Git to the server

IaaS

Containers & Micro Services

●Docker / Kubernetes
●Auto deployment
●Auto scale (automatically spawn new nodes)
●Pay based on the number of nodes
●Infrastructure as code! (new concept!)

PaaS

Why Containers?

Why Container Orchestration?
https://blog.risingstack.com/what-is-kubernetes-how-to-get-started/

Serverless

●Write code and deploy!
●Auto deploy
●Auto scale
●Pay per request
●No infrastructure!!

SaaS

It’s time to talk about..

Some Famous Features on GCP

GCP Computing
Virtual Machine

Containers

Serverless

Let’s Review Types of Databases
SQL NoSQL

CAP Theorem

GCP Storages & Databases
Non-serverless
Serverless

GCP Data Analytics
Pipeline Analytics Visualization

Credit Scoring Platform on Big Data Analytics

creditok.co

Why use serverless on big data?
●Scalable & super high performance
●No more server maintenance :)
●Easier to optimize
●Only pay per use

Requirements
●Have a HUGE data warehouse for batch processing
●Our customers have on-premise data at >400 sites
●A data ingestor app must be installed at every site
●Data ingestor app must be able to run on
●Data ingestor app must be super robust and easy to install
●Must run automatically every day via a task scheduler

When >400 sites upload large files
to your server at the same time..

This is unintentional DDoS!

So we mainly use Cloud Functions

●Auto scale
●But only accepts request bodies <10 MB

and also use
Compute/App Engine
for >10MB files
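
The split described above can be sketched as a small routing rule on the uploader side. This is only an illustration, not the actual Credit OK implementation: the function name, the target labels, and reading "10 MB" as 10 * 1024 * 1024 bytes are all assumptions.

```python
# Hypothetical routing helper: names, labels, and the exact byte
# threshold are assumptions made for illustration.
CLOUD_FUNCTION_LIMIT_BYTES = 10 * 1024 * 1024  # "<10 MB" from the slide


def choose_upload_target(file_size_bytes: int) -> str:
    """Return which backend should receive an upload of the given size."""
    if file_size_bytes < 0:
        raise ValueError("file size cannot be negative")
    if file_size_bytes < CLOUD_FUNCTION_LIMIT_BYTES:
        # Small bodies fit within the Cloud Functions request limit.
        return "cloud_function"
    # Larger files go to Compute Engine / App Engine instead.
    return "compute_or_app_engine"
```

For example, a 5 MB file would be routed to the Cloud Function, while a 50 MB file would go to the Compute/App Engine endpoint.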

Data Flow Architecture
[Figure: data flow diagram starting from the raw data sources]

Serverless
Big Data Architecture
In Summary
●Focus on design & coding
●A few people can achieve a huge task
●No cost for idle servers, pay as you use
(GCS storage ~$0.02 per GB)
●Processing cost is surprisingly low when optimized
(Beware of BigQuery cost!)
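
To make the storage figure concrete, here is a rough back-of-the-envelope sketch. It uses only the ~$0.02 per GB rate quoted above; the slide does not state the billing period or storage class, so the function and its default are assumptions for illustration, and BigQuery processing costs are not modeled.

```python
# Approximate GCS rate quoted on the slide; billing period is not stated there.
GCS_RATE_USD_PER_GB = 0.02


def estimate_storage_cost(stored_gb: float,
                          rate_usd_per_gb: float = GCS_RATE_USD_PER_GB) -> float:
    """Estimate storage cost in USD for the given volume at a flat per-GB rate."""
    if stored_gb < 0:
        raise ValueError("stored volume cannot be negative")
    return stored_gb * rate_usd_per_gb
```

At that rate, 1,000 GB of stored data works out to roughly $20 per billing period.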

Beware of ZONE_RESOURCE_POOL_EXHAUSTED
●Serverless doesn’t mean no server, you just do not need to spawn servers/workers
●Worker pools have limits, so do not run your app at peak time (but when is that?!)
●Hopefully Google will solve the problem soon :)
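
One common way to cope with this kind of transient capacity error is to retry with exponential backoff. The talk itself only suggests avoiding peak time, so the sketch below is a swapped-in technique, not what Credit OK describes. The exception class and function name are hypothetical stand-ins; a real job would need to detect the ZONE_RESOURCE_POOL_EXHAUSTED error from whatever client it uses.

```python
import time


class ResourcePoolExhausted(Exception):
    """Hypothetical stand-in for a ZONE_RESOURCE_POOL_EXHAUSTED failure."""


def run_with_backoff(task, max_attempts=5, base_delay_s=60.0, sleep=time.sleep):
    """Call task(); on pool exhaustion, wait and retry with doubling delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return task()
        except ResourcePoolExhausted:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            # Delay doubles each time: base, 2*base, 4*base, ...
            sleep(base_delay_s * (2 ** attempt))
```

The `sleep` parameter is injectable so the retry logic can be exercised without actually waiting.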

We Are Hiring!

●PHP Laravel/Lumen Developer
●Data Engineer
●Credit Risk Analyst

https://jobs.blognone.com/company/creditok

Question & Answer

Time is short, let’s utilize the networks.
Feel free to connect with me via spicydog.me
