Cloning Running Servers with Docker and CRIU by Ross Boucher

Docker, 31 slides, Jun 24, 2016

About This Presentation

Docker containers encapsulate everything you need to describe and run a process, but the lifecycle of a process remains the same: it starts, it runs for a while, and then it ends. This talk will demonstrate how to combine Docker with a tool called CRIU to “roll-back” running processes to an earl...


Slide Content

Cloning Running Servers
Ross Boucher
@boucher

$ node
>> 2 + 2
4
>>
Browser Server
2 + 2
4

x = 1
1
++x
2
x == 2
true
[1]
[2]
[3]
x = 1
1
++x
2
x == 2
true
[4]
[2]
[3]

CRIU (Checkpoint/Restore In Userspace)

CRIU
+

Application Server
Evaluation Controller
Public Internet, Internal Network, Isolated Network
Container Pool
User’s Browser
Primary Docker, Eval Docker
2 + 2
4

message: "evaluate"
nodeVersion: "4.x.x"
sources: [{
  type: "source",
  text: "var x = 2 + 2",
  checksum: "a8efd"
}]
url: "/users/boucher/repositories/12345/branches/master"
2 + 2
Application Server
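
The "evaluate" message identifies each source by a short checksum. The slides do not say how that checksum is computed, so the following is only a sketch under an assumption: a truncated SHA-1 hex digest of the source text. The names sourceChecksum and buildEvaluateMessage are hypothetical.

```javascript
const crypto = require("crypto");

// Hypothetical: derive a short content checksum from the source text.
// The hash function and the 5-character length are assumptions.
function sourceChecksum(text) {
  return crypto.createHash("sha1").update(text, "utf8").digest("hex").slice(0, 5);
}

// Build an "evaluate" message shaped like the one on the slide.
function buildEvaluateMessage(nodeVersion, url, texts) {
  return {
    message: "evaluate",
    nodeVersion,
    sources: texts.map((text) => ({
      type: "source",
      text,
      checksum: sourceChecksum(text),
    })),
    url,
  };
}

const msg = buildEvaluateMessage(
  "4.x.x",
  "/users/boucher/repositories/12345/branches/master",
  ["var x = 2 + 2"]
);
console.log(JSON.stringify(msg, null, 2));
```

Because the checksum depends only on the text, an unchanged cell keeps the same checksum across requests, which is what lets the server look up earlier work by it.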

message: "get-evaluator"
nodeVersion: "4.x.x"
checksums: [ "a8efd" ]
url: "/users/boucher/repositories/12345/branches/master"
2 + 2
Evaluation
Controller
Application
Server

function get_evaluator(configuration) {
  if (cached_containers[configuration]) {
    return cached_containers[configuration];
  }
  if (checkpoint_exists(configuration)) {
    return restored_container(configuration);
  }
  return pooled_container(configuration);
}
2 + 2
Evaluation Controller
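
The lookup order on this slide (cache first, then checkpoint, then pool) can be tried out with in-memory stand-ins. This is a minimal sketch, not the talk's implementation: configKey, the Map and Set, and the origin field are all hypothetical, and no real containers are involved.

```javascript
// In-memory stand-ins; real code would talk to Docker and the filesystem.
const cachedContainers = new Map(); // configuration key -> container
const checkpoints = new Set();      // configuration keys that have a checkpoint

// Assumption: a configuration is identified by its URL plus its ordered checksums.
// A string key is used because a plain object used as a property name
// would collapse to "[object Object]".
function configKey(configuration) {
  return configuration.url + "#" + configuration.checksums.join(",");
}

function getEvaluator(configuration) {
  const key = configKey(configuration);
  if (cachedContainers.has(key)) {
    return cachedContainers.get(key);      // already restored ahead of time
  }
  if (checkpoints.has(key)) {
    return { key, origin: "restored" };    // would restore from the checkpoint
  }
  return { key, origin: "pool" };          // fall back to a fresh pooled container
}

const config = {
  url: "/users/boucher/repositories/12345/branches/master",
  checksums: ["a8efd"],
};
const first = getEvaluator(config);
checkpoints.add(configKey(config));
const second = getEvaluator(config);
cachedContainers.set(configKey(config), { key: configKey(config), origin: "cache" });
const third = getEvaluator(config);
console.log(first.origin, second.origin, third.origin);
```

Each step up the list is cheaper than the one below it, which is why the cache is checked before a restore and a restore before a cold container.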

message: "found-evaluator"
IP: "172.0.1.201"
port: "7777"
Evaluation
Controller
Application
Server

message: "establish-connection"
message: "get-source"
  index: 0
message: "source-at-index"
  index: 0
  source: "var x = 2+2;"
message: "output"
  value: 4
Application
Server
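
The exchange above (request a source by index, receive it, report its output) could look roughly like this on the evaluator side. This is a sketch under assumptions: Node's built-in vm module stands in for whatever the real worker uses, requestSource replaces the network round trip, and "no reply" is taken to mean there are no more sources.

```javascript
const vm = require("vm");

// Hypothetical evaluator loop: ask for each source by index, run it in one
// shared context so later sources see earlier state, and report the result.
function runEvaluator(requestSource, send) {
  const context = vm.createContext({});
  for (let index = 0; ; index++) {
    send({ message: "get-source", index });
    const reply = requestSource(index); // stands in for the network round trip
    if (!reply) break;                  // assumption: no reply means no more sources
    const value = vm.runInContext(reply.source, context);
    send({ message: "output", index, value });
  }
  send({ message: "finished" });
}

const sources = ["var x = 2 + 2; x", "++x"];
const log = [];
runEvaluator(
  (index) =>
    index < sources.length
      ? { message: "source-at-index", index, source: sources[index] }
      : null,
  (m) => log.push(m)
);
const outputs = log.filter((m) => m.message === "output");
console.log(outputs);
```

The second source sees the x defined by the first because both run in the same context. That accumulated state is what a checkpoint has to preserve.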

message: "output"
index: 0
value: 4
2 + 2
Application Server
4

message: "checkpoint-evaluator"
checksum: "a8efd"
url: "/users/boucher/repositories/12345/branches/master"
2 + 2
Evaluation
Controller
Application
Server

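// Checkpoint the running container without stopping it.
await container.checkpoint({
  LeaveRunning: true,
  ImagesDirectory: "/checkpoints/<document_id>/<checksum>/"
});
// Commit a new image only if the container's filesystem has changed.
if ((await container.changes()).length > 0) {
  metadata.image = await container.commit({ pause: false });
}
// Save the metadata next to the checkpoint (path first, then data).
await fs.writeFile(
  "/checkpoints/<document_id>/<checksum>/metadata.txt",
  JSON.stringify(metadata)
);
2 + 2
Evaluation Controller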
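
The on-disk layout used here, one directory per document and checksum with a metadata file inside, can be exercised without Docker. This is a minimal sketch with hypothetical helper names and a temporary directory as the checkpoint root. Treating the presence of metadata.txt as "a checkpoint exists" is an assumption, as is storing the metadata as JSON.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical layout mirroring the slide: <root>/<document_id>/<checksum>/
function checkpointDir(root, documentId, checksum) {
  return path.join(root, documentId, checksum);
}

function writeMetadata(dir, metadata) {
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(path.join(dir, "metadata.txt"), JSON.stringify(metadata));
}

function readMetadata(dir) {
  return JSON.parse(fs.readFileSync(path.join(dir, "metadata.txt"), "utf8"));
}

// Assumption: a checkpoint counts as present once its metadata file exists.
function checkpointExists(dir) {
  return fs.existsSync(path.join(dir, "metadata.txt"));
}

const root = fs.mkdtempSync(path.join(os.tmpdir(), "checkpoints-"));
const dir = checkpointDir(root, "doc-12345", "a8efd");
writeMetadata(dir, { image: "example-image-id", checksum: "a8efd" });
console.log(checkpointExists(dir), readMetadata(dir));
```

Writing the metadata last means its presence can double as a marker that the checkpoint directory is complete. Synchronous calls keep the sketch short; a server would more likely use the promise-based API.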

# current cli
$ docker checkpoint --leave-running --image-dir=/checkpoint/path <container_id>
$ docker commit --pause=false <container_id>
# new cli
$ docker checkpoint --exit=false <container_id> <checkpoint_id>
2 + 2
Evaluation Controller

message: "get-source"
...
message: "source-at-index"
...
message: "finished"
(process exited)
Application
Server

message: "predict-evaluator"
checksums: [ "a8efd" ]
url: "/users/boucher/repositories/12345/branches/master"
2 + 2
Evaluation
Controller
Application
Server

async function predict_evaluator(configuration) {
  var cpPath = "/checkpoints/<document_id>/<checksum>/";
  // Load the metadata saved with the checkpoint.
  var metadata = JSON.parse(await fs.readFile(cpPath + "metadata.txt", "utf8"));
  // Restore a container from the checkpoint and keep it warm in the cache.
  var container = await docker.restore({
    ImagesDirectory: cpPath
    ...
  });
  cached_containers[configuration] = container;
}
2 + 2
Evaluation Controller
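
Since prediction restores a container before anyone asks for it, two predictions for the same checkpoint can overlap. One way to guard against a double restore is to cache the pending promise rather than the finished container. The sketch below is an illustration only: fakeRestore stands in for the real restore call, and warmContainers is a hypothetical cache.

```javascript
// Hypothetical stand-in for restoring a container from a checkpoint directory.
let restoreCalls = 0;
async function fakeRestore(imagesDirectory) {
  restoreCalls += 1;
  return { id: "container-" + restoreCalls, restoredFrom: imagesDirectory };
}

const warmContainers = new Map(); // checkpoint path -> promise of a container

async function predictEvaluator(documentId, checksum) {
  const cpPath = "/checkpoints/" + documentId + "/" + checksum + "/";
  // Store the promise, not the result, so a second request that arrives while
  // the restore is still in flight waits for it instead of starting another.
  if (!warmContainers.has(cpPath)) {
    warmContainers.set(cpPath, fakeRestore(cpPath));
  }
  return warmContainers.get(cpPath);
}

async function main() {
  const [first, second] = await Promise.all([
    predictEvaluator("12345", "a8efd"),
    predictEvaluator("12345", "a8efd"),
  ]);
  console.log(first === second, restoreCalls);
  return first === second;
}
const done = main();
```

Both callers receive the same container object, and the restore runs once.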

$ docker create tonic/worker
<container_id>
# current cli
$ docker restore --force --image-dir=/checkpoint/path <container_id>
# new cli
$ docker start --checkpoint <checkpoint_id> <container_id>
2 + 2
Evaluation Controller

message: "evaluate"
nodeVersion: "4.x.x"
sources: [
  { type: "source", text: "var x = 2 + 2", checksum: "a8efd" },
  { type: "source", text: "x++", checksum: "b9ccc" }
]
url: "/users/boucher/repositories/12345/branches/master"
Application
Server
2 + 2
4
x++
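
When a request carries several sources and an earlier one already has a checkpoint, only the later sources need to run. The sketch below shows one way to plan that. It assumes a checkpoint stored under a source's checksum captures the process state after that source and everything before it has run, which the slides imply but do not state. planEvaluation is a hypothetical name, and "a8efd" is the checksum used for the first source on the earlier slides.

```javascript
// Find the latest source that already has a checkpoint, then evaluate only
// what comes after it. With no checkpoint at all, everything is evaluated.
function planEvaluation(sources, hasCheckpoint) {
  let restoreIndex = -1;
  for (let i = sources.length - 1; i >= 0; i--) {
    if (hasCheckpoint(sources[i].checksum)) {
      restoreIndex = i;
      break;
    }
  }
  return {
    restoreFrom: restoreIndex >= 0 ? sources[restoreIndex].checksum : null,
    toEvaluate: sources.slice(restoreIndex + 1).map((s) => s.text),
  };
}

const existing = new Set(["a8efd"]);
const plan = planEvaluation(
  [
    { type: "source", text: "var x = 2 + 2", checksum: "a8efd" },
    { type: "source", text: "x++", checksum: "b9ccc" },
  ],
  (checksum) => existing.has(checksum)
);
console.log(plan);
```

Searching from the end picks the most recent usable state, so editing a late cell never forces the early ones to run again.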

function get_evaluator(configuration) {
  if (cached_containers[configuration]) {
    return cached_containers[configuration];
  }
  if (checkpoint_exists(configuration)) {
    return restored_container(configuration);
  }
  return pooled_container(configuration);
}
2 + 2
Evaluation Controller

Security Concerns
• Drop any capabilities you don’t need
• Set CPU, memory, and network constraints
• User Namespaces
• Network Isolation
• Seccomp
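
The first two bullets map onto fields of the container-create payload in the Docker Engine API. The sketch below builds such a payload as a plain object. The limit values and the network name "eval-isolated" are placeholders, not settings from the talk. User namespaces and seccomp are configured separately and are not shown.

```javascript
// Sketch of a container-create payload using Docker Engine API field names.
// All values are illustrative placeholders.
function lockedDownContainerConfig(image) {
  return {
    Image: image,
    HostConfig: {
      CapDrop: ["ALL"],               // drop every capability; add back only what is needed
      Memory: 256 * 1024 * 1024,      // memory limit in bytes
      CpuShares: 256,                 // relative CPU weight
      NetworkMode: "eval-isolated",   // hypothetical user-defined isolated network
    },
  };
}

const cfg = lockedDownContainerConfig("tonic/worker");
console.log(JSON.stringify(cfg, null, 2));
```

Keeping these settings in one function means every pooled or restored evaluator is created with the same restrictions.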

Security Concerns
• AUFS errors
• CRIU failures
• Race conditions
• Zombie processes
• Docker daemon restarts
• Filesystem management

DockWorker
github.com/boucher/dockworker

Container Migration
(and other potential use cases)

Further Reading
• Pre-compiled Release (based on Docker 1.10)
• Checkpoint/Restore Pull Request
• Saied Kazemi’s Linux Plumbers Talk
• CRIU Homepage
• DockerCon Doom Demo
• Tonic Blog on checkpoint/restore
• DockWorker on GitHub
• Using P.Haul with Docker
• DockerScript is a cool tool we use to manage our images

Thanks!
@boucher