CRIU: are we there yet?
Checkpoint-restore in userspace.
Are we there yet?
Pavel Emelyanov
LinuxCon Europe 2012
What is C/R and what is it for?
C/R is the ability to snapshot an application's state and later restore the application from that snapshot, at any time and in any place.
Usage scenarios:
Live migration
Reboot-less kernel update
Application start-up boost
Working environment snapshots
HPC load balancing
...
Is it possible to do all these nice things now?
Yes!
Almost.
And we're close to it!
This talk answers:
How will we be able to do it?
How close to it are we?
How far from impossible are we?
What has happened since then?
Everything is on the slides.
A brief history of C/R in Linux
2005: OpenVZ project starts with live-migration support, an all-in-kernel feature
2008: First collaborative attempt to get C/R upstream
2010: First more-or-less complete version (over 100 patches)
2011: First attempt to do C/R mostly in user-space
Jan 2012: Linus decided to merge the first set of patches upstream
Jul 2012: CRIU v0.1
Sep 2012: CRIU v0.2 + LXC support
A brief C/R history: the OpenVZ version, Oren's version, the attempt to merge Oren's version upstream, the CRIU proof of concept, Linus' "OK, let's take it", and the first two releases.
CRIU project ultimate goal
[Diagram: several applications (APP) with their resources (FS, MM, creds, timers, ...), some of them shared, plus environment resources (IPC, ..., network), are dumped into an image and later restored from it.]
Consider an application. It has a variety of resources associated with it: memory, open files, credentials, etc. There can be more than one application in play, some of them sharing resources. And that's not all: they may live in an environment (we call it a container, yes) with its own resources that are not bound to tasks, such as networking configuration or System V IPC objects.
What we do in CRIU is serialize the state of this whole thing into an image file (well, it's a set of files, but still). Later we can take this image and recreate the applications, with their resources and environment, in the very same state they were in before we dumped them.
CRIU project concept
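[Diagram: on dump, the CRIU tool asks the kernel "What files are opened?" for an APP that has opened an FD; on restore, the FD is opened again for the recreated ~APP.]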
Existing kernel APIs
[Diagram: the kernel interfaces used for dump and restore are proc, system calls and netlink; some report only about the calling task itself, others about any task.]
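As a rough illustration of how proc can answer "What files are opened?", here is a minimal Python sketch that lists a process's open descriptors by reading /proc/<pid>/fd. The helper name list_open_files is made up for this example; it is not taken from the CRIU tool, and it assumes a Linux system with /proc mounted.

```python
import os


def list_open_files(pid="self"):
    """Map each open descriptor of `pid` to what it points at, via /proc/<pid>/fd."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            # Each entry is a symlink named after the descriptor number.
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return result


if __name__ == "__main__":
    for fd, target in sorted(list_open_files().items()):
        print(fd, target)
```

Passing another task's PID instead of "self" works the same way, subject to the usual permission checks on /proc.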
How CRIU grows up
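[Diagram: an APP gets a resource FOO from the kernel. Where existing interfaces cannot give the CRIU tool info on FOO-s at dump ("?") or get FOO back for ~APP at restore ("X"), the kernel is extended ("Info FOO ++", "Get FOO ++").]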
CRIU project grow-up concept (Linus vision)
... this is a project by various mad Russians to perform c/r mainly from userspace, with various oddball helper code added into the kernel where the need is demonstrated.
So rather than some large central lump of code, what we have is little bits and pieces popping up in various places which either expose something new or which permit something which is normally kernel-private to be modified...
Kernel impact
~110 patches merged
~15 patches in flight
9 new features appeared (1 C/R-only)
2 new features to come
The most interesting new features in kernel
Parasite code injection: reads task state that, for now, a task can retrieve only about itself
The kcmp system call: helps check which kernel objects are shared between processes
Socket information dumping via netlink (sock_diag): an extensible engine for retrieving socket state
TCP repair mode: reads the intimate state of a TCP connection and reconstructs it from scratch on a freshly created socket
Other new features in kernel
Virtual net device indices: allow restoring network devices in a namespace
Proc map_files directory: find out exactly which file is mapped
Mappings sharing info
Socket peeking offset: allows peeking at socket queues (reading without removing data from the queue)
More gettable socket options: bound device, packet filter
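The socket peeking offset can be sketched in Python as follows. This is only an illustration on an AF_UNIX stream socket, not code from the CRIU tool. The helper name peek_whole_queue is invented here, and the fallback value 42 for SO_PEEK_OFF is an assumption that holds on x86 and most other Linux architectures but not on all of them.

```python
import socket

# Python's socket module may not export SO_PEEK_OFF; 42 is its value on
# x86 and most other Linux architectures (an assumption of this sketch).
SO_PEEK_OFF = getattr(socket, "SO_PEEK_OFF", 42)


def peek_whole_queue(sock, chunk_size=4096):
    """Return the bytes queued on `sock` without removing them from the queue."""
    # Enable the peek offset and start at the head of the queue.
    sock.setsockopt(socket.SOL_SOCKET, SO_PEEK_OFF, 0)
    chunks = []
    try:
        while True:
            try:
                # MSG_PEEK leaves the data queued; the kernel advances the
                # peek offset, so the next peek continues where this one ended.
                data = sock.recv(chunk_size, socket.MSG_PEEK | socket.MSG_DONTWAIT)
            except BlockingIOError:
                break  # nothing more is queued right now
            if not data:
                break  # the peer closed the connection
            chunks.append(data)
    finally:
        # Switch the peek offset off again (-1 disables it).
        sock.setsockopt(socket.SOL_SOCKET, SO_PEEK_OFF, -1)
    return b"".join(chunks)
```

For example, after `a, b = socket.socketpair()` and `a.sendall(b"hello world")`, calling `peek_whole_queue(b, chunk_size=4)` collects the queued bytes in several peeks, and a later plain `b.recv(64)` still sees the same data, because nothing was removed from the queue.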
CRIU features so far
x86_64 architecture
Process tree linkage
Multi-threaded apps
UNIX sockets
LXC container environment
Terminals, groups and sessions
Non-POSIX files (inotify, epoll, etc.)
Open files (+ shared and unlinked)
Memory mappings of all kinds
Established TCP connections
Kernel v3.6
How we test it
ZDTM: a set of atomic tests for every new piece of functionality
Real software:
Apache
MySQL
Make and gcc
Tar and gzip
Sshd with connections
Screen with top inside
VNC with xscreensaver and client connection
NGINX
MongoDB
tcpdump
Main plans for the nearest future
Full OS resources coverage
Merge in-flight patches, so that everything works on vanilla kernel
Properly integrate crtools with LXC and OpenVZ
Live-migration script
Pre-migrate app memory before freeze (speeds things up)
CRIU project resources
http://criu.org project news and documentation
http://git.criu.org git repo with tool sources
https://github.com/cyrillos/linux-2.6/ kernel with all in-flight patches applied
[email protected] mailing list
+CRIU page
Pavel Emelyanov