AmazonWebServicesLATAM
1,854 views
79 slides
Jun 08, 2015
About This Presentation
Containers for software! The newest revolution, brought to the world by Docker, running today on AWS. Come learn about this innovative and revolutionary technology that will change the way you develop and deploy software.
Slide Content
São Paulo
Running Docker Containers on AWS
Rafael Felix Correa
Consultant, AWS Professional Services
The challenge
Variety of stacks:
Static website: nginx 1.5 + modsecurity + openssl + bootstrap 2
Web frontend: Ruby + Rails + sass + Unicorn
User DB: postgresql + pgv8 + v8
Queue: Redis + redis-sentinel
Analytics DB: hadoop + hive + thrift + OpenJDK
Background workers: Python 3.0 + celery + pyredis + libcurl + ffmpeg + libopencv + nodejs + phantomjs
API endpoint: Python 2.7 + Flask + pyredis + celery + psycopg + postgresql-client
Do services and applications interact appropriately?
Variety of environments:
Development VM
QA server
Public cloud
Disaster recovery
Employee laptop
Production servers
Production cluster
Customer data center
Can I migrate quickly and without problems?
The matrix from hell
                   | Development VM | QA server | Single prod server | Internal cluster | Public cloud | Employee laptop | Customer servers
Static website     | ?              | ?         | ?                  | ?                | ?            | ?               | ?
Web frontend       | ?              | ?         | ?                  | ?                | ?            | ?               | ?
Background workers | ?              | ?         | ?                  | ?                | ?            | ?               | ?
User DB            | ?              | ?         | ?                  | ?                | ?            | ?               | ?
Analytics DB       | ?              | ?         | ?                  | ?                | ?            | ?               | ?
Queue              | ?              | ?         | ?                  | ?                | ?            | ?               | ?
Cargo transport before 1960
Variety of goods
Variety of means of transport and storage
Do I need to worry about how the goods interact? (example: coffee beans next to spices)
Can I transport quickly and without problems? (example: from ship to train to truck)
Solution: a standardized shipping container
Variety of goods
Variety of means of transport and storage
Do I need to worry about how the goods interact? (example: coffee beans next to spices)
Can I transport quickly and without problems? (example: from ship to train to truck)
A standard container, loaded with virtually any goods, stays sealed until it reaches its final destination...
...along the way, it can be loaded, unloaded, stacked, transported over long distances, and transferred from one mode of transport to another.
Docker is a shipping container system for code
Variety of stacks: Static website, Web frontend, User DB, Queue, Analytics DB
Variety of environments: Development VM, QA server, Public cloud, Employee laptop, Production cluster, Customer data center
Do services and applications interact appropriately?
Can I migrate quickly and without problems?
An engine that lets any content be encapsulated as a lightweight, portable, self-sufficient container...
...that can be manipulated and operated consistently on virtually any hardware platform.
Docker eliminates the matrix from hell
Rows: Static website, Web frontend, Background workers, User DB, Analytics DB, Queue
Columns: Development VM, QA server, Single prod server, Internal cluster, Public cloud, Employee laptop, Customer servers
What is a container?
•A set of Linux kernel features that isolate processes, user IDs, memory, I/O, and network
•"chroot on steroids!"
•Granular resource allocation
•Namespaces, cgroups
•These features exist and have been in use for years (Solaris "Zones")
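To make the cgroups bullet a little more concrete: on Linux, the control groups a process belongs to are listed in /proc/<pid>/cgroup, one line per hierarchy, in the form hierarchy-ID:controller-list:cgroup-path. The sketch below only parses text in that format. The sample content and its /docker/abc123 paths are made up for illustration and are not taken from the presentation.

```python
from typing import NamedTuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: tuple
    path: str


def parse_cgroup_file(text: str) -> list:
    """Parse the contents of /proc/<pid>/cgroup into structured entries."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each line is "hierarchy-ID:controller-list:cgroup-path".
        # Split at most twice so a ":" inside the path is preserved.
        hierarchy_id, controllers, path = line.split(":", 2)
        names = tuple(c for c in controllers.split(",") if c)
        entries.append(CgroupEntry(int(hierarchy_id), names, path))
    return entries


# Illustrative sample only; the container ID and paths are hypothetical.
SAMPLE = """\
4:memory:/docker/abc123
3:cpu,cpuacct:/docker/abc123
0::/
"""

if __name__ == "__main__":
    for entry in parse_cgroup_file(SAMPLE):
        print(entry)
```

On a Linux host, passing the text of /proc/self/cgroup to parse_cgroup_file shows which groups the current process is accounted under.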
This looks like a virtual machine (VM)!
[Diagram: VM stack vs. container stack]
VM: Server, Host OS, Hypervisor, then one Guest OS with its own Bins/Libs per app (App A, App A', App B)
Container: Server, Host OS, Docker, then Bins/Libs shared by related apps (App A, App A'; App B, App B', App B', App B')
Containers are isolated, but share the OS and, where applicable, bins/libs...
...which means containers can be thought of as "lean VMs".
About Docker
•Docker is a complete platform for building and running containers
•It was released as open source in March 2013 by dotCloud (now Docker, Inc.)
•It includes a CLI, tools for building container images (Dockerfiles), and a repository for hosting containers
About MercadoLibre
•Operations in 13 countries.
•Leaders in the region.
•6.5 billion market cap (NASDAQ: MELI).
•2,600 employees (600 engineers).
•8th e-commerce site by traffic in the world.
About Me
•Darío Simonassi
–Architecture Sr. Manager
–MercadoLibre.Com
–@ldsimonassi
About Me: Lately
•Darío Simonassi
–Architecture Sr. Manager
–MercadoLibre.Com
–@ldsimonassi
MercadoLibre’s technology stack
•Microservices architecture.
•REST API & frontends.
•650 applications.
•2 on-premises datacenters in the US.
•Private cloud: OpenStack, 15,000 VMs.
•Fully self-provisioned infrastructure services.
MercadoLibre’s Architecture
Frontends: Home, Search, Item Page, Checkout
REST API: Items, Users, Orders, Questions
MercadoLibre’s engineering teams
•Small teams.
•Full control & Ownership:
–Product decisions
–Development.
–QA (Automated).
–Deployment.
–Alerts & Production management.
Current technology stack: opportunities
•Great for:
–Empowerment.
–Flexibility.
•Opportunities:
–Complex.
–Lots of tools.
–A lot of knowledge required.
–Policies are difficult to enforce.
Operations
•Code
–Code, run, play & execute tests locally.
•Deploy
–CI, fast & safe code shipping.
•Monitoring & Alerting
–Know when something is going wrong.
–Understand what is going wrong.
–Notify the correct person.
–Follow up.
•Troubleshooting
–Take action to solve a problem.
Code
•Developer machine setup.
–Local setup for DB & mock servers.
–Version headaches.
–Propagating environment changes.
•CI environment consistency.
–Developers don’t manage CI servers.
–What works on my machine must work in CI.
•Environment fidelity.
–Use the same libraries, same binaries.
[Diagram: Developer Machine, showing Text Editor, Source Code, docker-compose.yml, Fury CLI ($> fury run), Browser, and the components Grails SDK, Redis, Node.JS, API Mock]
Code: Fury CLI Commands
•$> fury get homepage
–Will clone the homepage project repository.
•$> fury run
–Will run the webserver with all the dependencies locally.
•$> fury test
–Will run tests locally with all the dependencies.
•$> fury create-version 0.1
–Will push the code.
–Will tag the version.
–Will trigger CI and open a browser window.
Deploy
•Risk of downtime
–Smooth transitions between versions.
–Real time metrics.
–Fast rollback.
•Reliability & Fidelity
–New infrastructure vs. update scripts.
•Hardware usage efficiency
–Autoscaling.
Production Instance
•No Updates
–New application version -> New hardware
•No States
–No states like: on_duty, deploying.
–Instances are always supposed to be working.
•Single AMI
–Only one AMI that will run a container inside.
–The same container for the entire instance life.
•Minimal
–Only required stuff in this AMI.
ELB
Deployment: In progress 100%
Fast Rollback Spare Pool
ELB
Deployment: Finishing
ELB
Deployment: Done
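The deployment slides above can be read as a pool swap behind the load balancer: new instances take the traffic while the previous ones wait as a spare pool for fast rollback, and the spare pool is released once the deployment is finished. The toy model below sketches that reading. It is an assumption about the flow, not the presenter's implementation; the LoadBalancer class and the instance names are hypothetical and it makes no AWS calls.

```python
from dataclasses import dataclass, field


@dataclass
class LoadBalancer:
    """Toy stand-in for an ELB: tracks which pool receives traffic."""
    active: list = field(default_factory=list)
    spare: list = field(default_factory=list)

    def deploy(self, new_instances):
        # New version -> new instances; the old ones become the rollback pool.
        self.spare = self.active
        self.active = list(new_instances)

    def rollback(self):
        # Fast rollback: send traffic back to the spare pool, drop the new one.
        if not self.spare:
            raise RuntimeError("no spare pool to roll back to")
        self.active, self.spare = self.spare, []

    def finish(self):
        # Deployment done: release the spare pool and report what was released.
        released, self.spare = self.spare, []
        return released


if __name__ == "__main__":
    lb = LoadBalancer(active=["v1-a", "v1-b"])
    lb.deploy(["v2-a", "v2-b"])
    print("serving:", lb.active, "spare:", lb.spare)
    print("released:", lb.finish())
```

Because instances are never updated in place, rolling back is only a matter of pointing traffic at instances that are still running the previous version.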
Monitoring & Alerting
•Alerting:
–To know fast and reliably when something is going wrong:
•No false alarms.
•Instance health no longer suitable.
•Notify the right people & follow up.
•Diagnostics:
–To be able to understand what is going wrong.
•Metrics
–Multidimensional.
–Real-Time.
•Logs.
Alerting: Based on metrics
•HTTP:
–Errors.
–Response time.
•Infrastructure:
–How many machines stopped working unexpectedly during the last hour?
•Business:
–Is the application doing what it should?
–Compare vs. yesterday / last week.
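One way to picture the business check ("compare vs. yesterday / last week") is a relative-deviation test against both reference points. This is a minimal sketch under assumptions: the function names and the 30% tolerance are hypothetical, and the slides do not say how the comparison is actually done.

```python
def deviates(current: float, baseline: float, tolerance: float = 0.3) -> bool:
    """True when current differs from baseline by more than tolerance (a fraction)."""
    if baseline == 0:
        return current != 0
    return abs(current - baseline) / abs(baseline) > tolerance


def business_alert(current, yesterday, last_week, tolerance=0.3):
    """Alert only when the metric deviates from both reference points.

    Requiring both comparisons to fail avoids alerting on a value that is
    unusual against yesterday but normal for this day of the week.
    """
    return (deviates(current, yesterday, tolerance)
            and deviates(current, last_week, tolerance))


if __name__ == "__main__":
    # Hypothetical orders-per-minute values at the same time of day.
    print(business_alert(current=50, yesterday=100, last_week=110))
    print(business_alert(current=95, yesterday=100, last_week=110))
```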
Alerting: No false alarms
•Reasonable thresholds.
–Undesirable but expected situations shouldn’t be an alert.
–You won’t improve your application SLA at 3 AM on a Saturday.
•Different thresholds for different periods.
–Alert on > 25% errors in the last minute.
–Alert on > 5% errors in the last hour.
–Alert on > 0.2% errors in the last day.
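The three thresholds above can be sketched as one check evaluated over several time windows. The window lengths and limits come from the slide; the ErrorRateAlert class, its method names, and the in-memory event list are assumptions made for illustration, and events are assumed to be recorded in timestamp order.

```python
from collections import deque

# (window length in seconds, maximum tolerated error ratio), as on the slide.
THRESHOLDS = [(60, 0.25), (3600, 0.05), (86400, 0.002)]


class ErrorRateAlert:
    def __init__(self, thresholds=THRESHOLDS):
        self.thresholds = list(thresholds)
        self.events = deque()  # (timestamp, is_error), oldest first
        self.max_window = max(window for window, _ in self.thresholds)

    def record(self, timestamp, is_error):
        """Register one request; timestamps must be non-decreasing."""
        self.events.append((timestamp, bool(is_error)))

    def breached(self, now):
        """Return the window lengths whose error ratio exceeds its limit."""
        # Forget events that are older than the longest window.
        while self.events and self.events[0][0] <= now - self.max_window:
            self.events.popleft()
        result = []
        for window, limit in self.thresholds:
            recent = [err for ts, err in self.events if ts > now - window]
            if recent and sum(recent) / len(recent) > limit:
                result.append(window)
        return result


if __name__ == "__main__":
    alert = ErrorRateAlert()
    for i in range(100):
        alert.record(9990, is_error=(i < 30))  # 30% errors in the last minute
    print(alert.breached(now=10000))
```

A short burst has to be severe to trip the one-minute window, while a low but persistent error rate is still caught by the hour and day windows.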
Alerting: Notifications.
•Teams manage the duty schedule.
•Escalation & Backup is necessary.
•Automatic Alerts follow up.
•Flexibility & Mobile.
–“I’m going to watch a movie, can someone cover my duties for the next three hours?”
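The movie example amounts to a temporary override on top of the regular duty schedule, with a backup person kept for escalation. A minimal sketch, assuming a single primary and backup per team; the DutySchedule class and the people's names are hypothetical, and times are plain hour numbers to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Override:
    start: float      # hours from an arbitrary reference point
    end: float
    substitute: str


class DutySchedule:
    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup
        self.overrides = []

    def hand_over(self, start, hours, substitute):
        """Someone else takes the duty for a limited time."""
        self.overrides.append(Override(start, start + hours, substitute))

    def on_call(self, now):
        # The most recently added matching override wins.
        for override in reversed(self.overrides):
            if override.start <= now < override.end:
                return override.substitute
        return self.primary

    def escalation_chain(self, now):
        """Who gets notified first, and who is next if there is no follow-up."""
        first = self.on_call(now)
        chain = [first]
        if self.backup != first:
            chain.append(self.backup)
        return chain


if __name__ == "__main__":
    schedule = DutySchedule(primary="ana", backup="bruno")
    schedule.hand_over(start=20, hours=3, substitute="carla")
    print(schedule.escalation_chain(now=21))
    print(schedule.escalation_chain(now=23))
```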
Diagnostics: Metrics
•Real Time.
–You’ll be changing things and you’ll need feedback.
•Multidimensional
–Tagged metrics will really make the difference when trying to identify the failure.
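To show why tags help when identifying a failure, the sketch below records each data point with arbitrary tags and then groups one metric by a chosen tag. The TaggedMetrics class, the metric name, and the tag names (app, version, instance) are hypothetical; a real setup would use a metrics backend rather than an in-memory list.

```python
from collections import defaultdict


class TaggedMetrics:
    def __init__(self):
        self.points = []  # (metric name, value, tags)

    def record(self, name, value, **tags):
        self.points.append((name, value, tags))

    def sum_by(self, name, tag):
        """Total of one metric, broken down by the values of one tag."""
        totals = defaultdict(float)
        for metric, value, tags in self.points:
            if metric == name:
                totals[tags.get(tag, "unknown")] += value
        return dict(totals)


if __name__ == "__main__":
    metrics = TaggedMetrics()
    metrics.record("http.errors", 1, app="homepage", version="0.1", instance="i-1")
    metrics.record("http.errors", 4, app="homepage", version="0.2", instance="i-2")
    metrics.record("http.errors", 2, app="homepage", version="0.2", instance="i-3")
    # Slicing by version points at the release; slicing by instance at a machine.
    print(metrics.sum_by("http.errors", "version"))
    print(metrics.sum_by("http.errors", "instance"))
```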
Diagnostics: Metrics key features
Diagnostics: Logs
•Useful logs
–Events.
–Unusual errors.
–Avoid traces.
•Tagged
–Good tagging will allow you to make sense of the logged data.
Logs: Tagging
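One possible shape for tagged logs, using only Python's standard logging module: every line is emitted as JSON that carries static tags (set once per process) plus per-event tags passed through the extra argument. The formatter class, the make_logger helper, and the tag names are assumptions for illustration, not the tooling described in the talk.

```python
import io
import json
import logging


class TaggedJsonFormatter(logging.Formatter):
    """Render each record as one JSON line with merged static and event tags."""

    def __init__(self, static_tags):
        super().__init__()
        self.static_tags = dict(static_tags)

    def format(self, record):
        entry = {
            "level": record.levelname,
            "event": record.getMessage(),
            # Per-event tags arrive via logger.info(..., extra={"tags": {...}}).
            "tags": {**self.static_tags, **getattr(record, "tags", {})},
        }
        return json.dumps(entry, sort_keys=True)


def make_logger(name, stream, **static_tags):
    handler = logging.StreamHandler(stream)
    handler.setFormatter(TaggedJsonFormatter(static_tags))
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    out = io.StringIO()
    log = make_logger("demo", out, app="homepage", version="0.1")
    log.info("checkout failed", extra={"tags": {"country": "BR"}})
    print(out.getvalue(), end="")
```

With every line tagged this way, logs can be filtered by application, version, or any event-level tag instead of being searched as free text.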
Troubleshooting: Actions
•Deploy
–We’ve deployed a bug.
–We need to roll back fast or deploy a new version fast.
•Scale
–There is a performance issue, we need to increase the compute power.
–There is a specific bug, we need to block traffic.
•Restart
–There is a leak (memory, connections, etc.).
–Restarting the application is a suitable yet temporary workaround for those cases.