2012 Fall OpenStack Bare-metal Speaker Session

General Bare-metal Provisioning Framework
Mikyung Kang, USC/ISI
David Kang, USC/ISI
Ken Igarashi, NTT docomo
Mana Kaneko, NTT docomo
Hiromichi Ito, Virtual Tech Japan
Arata Notsu, Virtual Tech Japan


Overview

o Why General Bare-metal Provisioning? (USC/ISI)
  - Why Bare-metal provisioning?
  - OpenStack Bare-metal History: Essex – Folsom – Grizzly
  - Bare-metal Provisioning Framework
  - Bare-metal Release Plan
o Bare-metal provisioning support? (NTT Docomo)
  - Instance Request
  - Nova-compute Selection
  - Image Provisioning
  - Network Isolation
  - Nova-volume Attachment
  - VNC Access
  - Snapshot

General Bare-Metal Provisioning Framework (Speaker Session)


Why Bare-metal Provisioning?


o Manage bare-metal machines using OpenStack

[Figure: OpenStack manages both virtual machines and bare-metal machines. Labels: real-time analysis, various CPU support, management using OpenStack.]

Why Bare-metal Provisioning?


o Difference between VM and bare-metal machines
  - Virtual machines
    - A hypervisor exists between the physical resources and the virtual machines
    - Image provisioning, VM power management, volume isolation (iSCSI), console access (VNC), VM snapshot
  - Bare-metal machines
    - There is no hypervisor
    - A bare-metal machine can access physical resources freely
    - Need to achieve the same security level as virtual environments

[Figure: a virtual machine runs on a hypervisor (OpenStack) and host OS over the hardware (CPU, MEM, HDD, NIC); a bare-metal machine runs directly on the hardware. Both connect to network storage (iSCSI), the network (VLAN), and the OS image DB.]

Why Bare-metal Provisioning?


o Virtual machine vs. bare-metal machine instances

[Figure: for virtual machines, Nova-Compute (virtual) runs on a host OS and hypervisor over the hardware (CPU, MEM, HDD, NIC) and offers m1.tiny, m1.medium, and m1.large. For bare-metal machines, Nova-Compute with the bare-metal driver aggregates resources (CPU, MEM, HDD) and offers bm1.tiny and bm1.medium.]

OpenStack Bare-metal History


Essex Release: April 2012
• Non-PXE Tilera multi-core bare-metal machines

Folsom Release: Sept. 2012
• Non-PXE Tilera multi-core bare-metal machines
• Pending review: PXE support & bare-metal MySQL DB

Grizzly Release: April 2013
• Finish review -> merge to upstream: basic functions
• New features including fault-tolerance and security enhancement as well as scheduler changes

OpenStack Bare-metal History (Essex, Folsom)

o Initial design for Tilera (Non-PXE) image provisioning (TFTP/NFS)

Bare-metal Provisioning Framework (Grizzly)

Compute node w/ bare-metal plugin:

1. compute_driver=baremetal.driver.BareMetalDriver
   - LibvirtDriver (nova/virt/libvirt/driver.py)
   - BareMetalDriver (nova/virt/baremetal/driver.py)
2. baremetal_driver={baremetal.tilera.TILERA | baremetal.pxe.PXE}
   - Non-PXE (TILERA) (nova/virt/baremetal/tilera.py)
   - PXE (nova/virt/baremetal/pxe.py)
3. power_manager={baremetal.tilera_pdu.Pdu | baremetal.ipmi.Ipmi}
   - PDU (nova/virt/baremetal/tilera_pdu.py)
   - IPMI (nova/virt/baremetal/ipmi.py)
4. instance_type_extra_specs=cpu_arch:xxx
   - Tilepro64, ARM, x86_64

Source tree:

nova/virt/
  libvirt/
    driver.py
  baremetal/
    driver.py
    tilera.py
    tilera_pdu.py
    pxe.py
    ipmi.py
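A minimal sketch of how two configuration options could select a provisioning back-end and a power manager independently of each other. The option names and values are the ones on this slide; the stand-in classes, the two registries, and the `build_driver` helper are hypothetical and are not the Nova implementation.

```python
# Illustrative sketch only: these classes are stand-ins, not the Nova drivers.

class Tilera:
    name = "tilera"


class Pxe:
    name = "pxe"


class Pdu:
    name = "pdu"


class Ipmi:
    name = "ipmi"


# Hypothetical registries keyed by the option values shown on the slide.
BAREMETAL_DRIVERS = {
    "baremetal.tilera.TILERA": Tilera,
    "baremetal.pxe.PXE": Pxe,
}
POWER_MANAGERS = {
    "baremetal.tilera_pdu.Pdu": Pdu,
    "baremetal.ipmi.Ipmi": Ipmi,
}


def build_driver(conf):
    """Return (provisioning back-end, power manager) for a config dict."""
    try:
        backend_cls = BAREMETAL_DRIVERS[conf["baremetal_driver"]]
        power_cls = POWER_MANAGERS[conf["power_manager"]]
    except KeyError as exc:
        raise ValueError(f"unknown or missing option: {exc}") from exc
    return backend_cls(), power_cls()


backend, power = build_driver({
    "baremetal_driver": "baremetal.pxe.PXE",
    "power_manager": "baremetal.ipmi.Ipmi",
})
print(backend.name, power.name)
```

Keeping the two choices in separate lookups mirrors the slide's split between image provisioning (Tilera or PXE) and power control (PDU or IPMI).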

Bare-metal Provisioning Framework

o Bare-metal nova-compute vs. back-end machines

[Figure: Nova-compute w/ bare-metal plug-in registers bare-metal resources with Nova-scheduler (cpu_arch=*, hypervisor_type=baremetal) and manages three bare-metal back-ends: PXE: x86_64 (x86_64 BM farm), PXE: ARM (ARM BM farm), and Non-PXE: Tilera (Tilera BM farm). Labels: bare-metal nodes information, maximum capability, homogeneous capability.]

Bare-metal Provisioning Framework (Essex, Folsom)

[Figure: Nova-Compute with the bare-metal driver aggregates resources (CPU, MEM, HDD; bm1.tiny, bm1.medium) and reports them to Nova-Scheduler, including the total number of bare-metal machines. Bare-metal filter: cpu_arch & hypervisor_type.]

Bare-metal Provisioning Framework (Grizzly)

o Bare-metal MySQL DB
  baremetal_sql_connection = mysql://$ID:$Password@$IP/nova_bm
o Registers bare-metal resources
o Multiple capabilities

[Figure: Nova-Compute with the bare-metal driver aggregates resources (CPU, MEM, HDD; bm1.tiny, bm1.medium) and reports them to Nova-Scheduler. Bare-metal filter: cpu_arch & hypervisor_type.]
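The bare-metal filter on cpu_arch and hypervisor_type can be illustrated with a small sketch. The `HostState` class, the host names, and the rule that a request without a cpu_arch extra spec goes to non-bare-metal hosts are assumptions made for this example; they are not taken from the Nova scheduler code.

```python
from dataclasses import dataclass


@dataclass
class HostState:
    """Hypothetical summary of what a nova-compute reports to the scheduler."""
    name: str
    hypervisor_type: str
    cpu_arch: str


def baremetal_filter(hosts, extra_specs):
    """Return the hosts that can serve an instance type with these extra specs."""
    wanted_arch = extra_specs.get("cpu_arch")
    if wanted_arch is None:
        # Assumption: no cpu_arch extra spec means a virtual instance type.
        return [h for h in hosts if h.hypervisor_type != "baremetal"]
    return [
        h for h in hosts
        if h.hypervisor_type == "baremetal" and h.cpu_arch == wanted_arch
    ]


hosts = [
    HostState("virt-1", "qemu", "x86_64"),
    HostState("bm-tilera", "baremetal", "tilepro64"),
    HostState("bm-x86", "baremetal", "x86_64"),
]

for specs in ({}, {"cpu_arch": "tilepro64"}, {"cpu_arch": "x86_64"}):
    print(specs, [h.name for h in baremetal_filter(hosts, specs)])
```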

Bare-metal Release Plan


Grizzly-1: Nov. 22nd

Grizzly-3: Feb. 21st

Copyright©2011 NTT DOCOMO, INC. All rights reserved.

General Bare-Metal Provisioning Framework

Ken Igarashi Mana Kaneko

(NTT docomo Inc.)

Benchmarking

o CPU (Coremark)
o TCP Throughput (Netperf)
o Context Switch (LMBench)
o Ping

[Figures: CPU score for Baremetal and Virtual; TCP throughput [Mbps], transmit and receive, for Baremetal, Virtual, and SR-IOV; context-switch time [µs] vs. number of processes (2 to 96) for Baremetal and Virtual; ping latency [ms] vs. packet size [bytes] (64, 1024, 1500) for Baremetal, SR-IOV, and Virtual.]

VM Provisioning Procedure in Nova

1. Instance Request
2. Choose Nova-Compute
3. Image Provisioning
4. Network Isolation
5. Nova-Volume Attachment
6. VNC Access
7. Snapshot

[Figure, built up step by step: Nova-API, Nova-Scheduler, and Glance in front of three Nova-Compute hosts (each with a host OS and a hypervisor running VMs); Nova-Volume storage holds USER1 Vol-13, USER1 Vol-14, USER2 Vol-11, and USER2 Vol-12; a snapshot is stored in Glance as an AMI.]

Bare-Metal Provisioning Functions

o We need to implement the same functions for bare-metal provisioning
  1. Instance Request: description for bare-metal machine instances
  2. Choose Nova-Compute: scheduler for bare-metal machines
  3. Image Provisioning: turn on/off and deploy images to bare-metal machines
  4. Network Isolation: create a private LAN among bare-metal machines
  5. Nova-Volume Attachment: provide secure iSCSI access
  6. VNC Access: provide console access to bare-metal servers
  7. Snapshot: create a new AMI from a running VM

How to achieve those functions without a hypervisor?
  - Keep compatibility (same API)
  - Less impact to Nova

1. Instance Request

o Create instance types for bare-metal machines
o Bare-metal machine instances have "instance_type_extra_specs"
  - euca-run-instances -t m1.tiny -> create a virtual instance
  - euca-run-instances -t b1.tiny -> create a bare-metal instance

Instance types:

Name       Id  memory_mb  VCPUS  local_gb
m1.tiny    1   512        1      40
m1.medium  2   4096       2      80
b1.tiny    3   512        1      40
b1.medium  4   4096       2      80

instance_type_extra_specs:

Id  key       value
3   cpu_arch  tilepro64
4   cpu_arch  x86_64

2. Choose Nova-Compute (Scheduler)

o Create pseudo Nova-Computes for bare-metal machines
o Filter scheduler can classify virtual and bare-metal machines

[Figure: Nova-API passes requests (b1.tiny, m1.medium) to Nova-Scheduler. The filter scheduler routes bare-metal types (b1.tiny, b1.medium) to the Nova-Compute with the bare-metal driver, which aggregates CPU, MEM, and HDD, and routes virtual types (m1.tiny, m1.medium, m1.large) to Nova-Compute (virtual) hosts running a host OS and hypervisor over the hardware (CPU, MEM, HDD, NIC).]

3. Image Provisioning (x86_64)

0. Preparation
   - Create "kernel + ramdisk" with "baremetal-mkinitrd.sh", and register them to glance (AKI, ARI)
   - Edit nova.conf on nova-compute:
       Specify nova-compute type:
         compute_driver=nova.virt.baremetal.driver.BareMetalDriver
       Driver for nova-compute and power manager:
         baremetal_driver=nova.virt.baremetal.pxe.PXE
         power_manager=nova.virt.baremetal.ipmi.Ipmi
       AKI and ARI for 1st boot:
         baremetal_deploy_ramdisk = 843adb6d-e0f8-452d-9a60-d8c883a0983c
         baremetal_deploy_kernel = 7dfd792c-fc85-480e-8d07-7d9b20d58c24
   - Run bare-metal deployment servers
       - dnsmasq (PXE server)
       - bm_deploy_server

3. Image Provisioning (x86_64)

Request (Nova-API -> Nova-Scheduler, b1.tiny):
  euca-run-instances -t b1.tiny --ramdisk ari-bare (--kernel aki-bare) ami-bare

1. 1st Boot
   - PXE boot: use the kernel/ramdisk for the deployment, AKI (deploy) and ARI (deploy), from nova-compute/PXE server
2. System Setup
   - nova-compute/bm_deploy_server reads the configuration (MAC and IP address) from Nova-Network and sends the AMI (AMI-bare) via iSCSI
   - Setup steps:
     1. Create file system (SWAP)
     2. Configure MAC and IP address
     3. Set up PXE for 2nd boot
     4. Reboot

[Figure: bare-metal machines booting from nova-compute/PXE server and receiving the image from nova-compute/bm_deploy_server.]

3. Image Provisioning (x86_64)

3. 2nd Boot
   - PXE boot: use the kernel/ramdisk for the provisioning (aki-bare, ari-bare) from nova-compute/PXE server
   - Boot from local HDD (AMI-bare)

Request:
  euca-run-instances -t b1.tiny --ramdisk ari-bare (--kernel aki-bare) ami-bare

Result: bare-metal instance

4. Network Isolation

o Virtual machines
  - The hypervisor checks addresses (IP and MAC) and puts the VLAN tag
o Bare-metal machines
  - A user can change the address and VLAN tag freely

[Figure: on a virtual machine, the hypervisor blocks IP address spoofing (pretending to be others): OK. On a bare-metal machine there is no hypervisor, so MAC, IP address, and VLAN spoofing (pretending to be others) is possible: NG.]

4. Network Isolation (β version)

o Use Quantum: NEC's Trema + OpenFlow switch
  - Protect against address spoofing (MAC and IP)
  - Create a private network among instances

Rules set by the OpenFlow controller (Trema from NEC):

  of_in_port=<switch's port> src_mac != <Instance's MAC> -> DROP
  of_in_port=<switch's port> src_ip != <Instance's IP> -> DROP
  of_in_port=* dst_ip=<Instance's IP> protocol and dst_port allowed by security group -> ALLOW
  of_in_port=* dst_ip=<BROADCAST> protocol and dst_port allowed by security group -> ALLOW

[Figure: Nova-Compute (bare-metal) and Quantum drive the OpenFlow controller; bare-metal machines in Security Group A and Security Group B are attached to the OpenFlow switch.]
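The four flow rules above can be read as a small decision function. This is a simplified sketch, not Trema or Quantum code: the `Packet` and `Instance` classes, the broadcast address constant, and the final default-deny are assumptions added for illustration, since the slide does not say what happens to traffic that matches no rule.

```python
from dataclasses import dataclass

BROADCAST = "255.255.255.255"  # assumed broadcast address for this sketch


@dataclass(frozen=True)
class Packet:
    in_port: int
    src_mac: str
    src_ip: str
    dst_ip: str
    protocol: str
    dst_port: int


@dataclass(frozen=True)
class Instance:
    switch_port: int
    mac: str
    ip: str
    allowed: frozenset  # (protocol, dst_port) pairs allowed by the security group


def decide(packet, instances):
    """Apply the anti-spoofing rules first, then the security-group rules."""
    # Rules 1 and 2: a packet entering on an instance's switch port must
    # carry that instance's own MAC and IP as its source addresses.
    for inst in instances:
        if packet.in_port == inst.switch_port:
            if packet.src_mac != inst.mac or packet.src_ip != inst.ip:
                return "DROP"
    # Rules 3 and 4: traffic to an instance (or broadcast) is allowed only
    # for protocol/port pairs that the security group permits.
    for inst in instances:
        if packet.dst_ip in (inst.ip, BROADCAST):
            if (packet.protocol, packet.dst_port) in inst.allowed:
                return "ALLOW"
    return "DROP"  # assumption: anything not explicitly allowed is dropped


a = Instance(1, "aa:aa:aa:aa:aa:aa", "10.0.0.1", frozenset({("tcp", 22)}))
b = Instance(2, "bb:bb:bb:bb:bb:bb", "10.0.0.2", frozenset({("tcp", 80)}))

ok = Packet(1, a.mac, a.ip, b.ip, "tcp", 80)
spoofed = Packet(1, a.mac, "10.0.0.9", b.ip, "tcp", 80)
print(decide(ok, [a, b]), decide(spoofed, [a, b]))
```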

5. Nova-Volume Attachment

o Virtual machines
  - Nova-Volume is transparent to users
o Bare-metal machines
  - A user can see all Nova-Volumes

[Figure: Nova-Volume storage holds volumes of several users (USER1 to USER4, Vol-11 to Vol-14). From a virtual machine behind a hypervisor, "iscsiadm -m discovery" doesn't work. From a bare-metal machine, "iscsiadm -m discovery" can see all the volumes.]

5. Nova-Volume Attachment (β version)

o Use Nova-Compute as a proxy of Nova-Volume
  - Separate the Nova-Volume network and provide ACL using CHAP
    1. Isolate the iSCSI network
    2. Provide an ACL for each bare-metal machine

[Figure: Servers A to D reach Nova-Volume storage (USER1 Vol-13, USER2 Vol-14, USER3 Vol-11, USER4 Vol-12) through an OpenFlow switch; the Nova-Volume network is separated from the bare-metal Nova-Volume network.]
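To illustrate why CHAP works as a per-machine ACL, here is a minimal sketch of the CHAP challenge-response check as defined in RFC 1994 (MD5 over identifier, secret, and challenge). The per-volume secret table and the helper names are hypothetical, and a real iSCSI target performs this exchange itself; nothing here is taken from the Nova-Volume code.

```python
import hashlib
import hmac
import os


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """CHAP response: MD5(identifier || secret || challenge), per RFC 1994."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


# Hypothetical table: each volume has its own secret, known only to its owner.
VOLUME_SECRETS = {
    "Vol-13": b"secret-of-user1",
    "Vol-14": b"secret-of-user2",
}


def target_accepts(volume: str, identifier: int, challenge: bytes,
                   response: bytes) -> bool:
    """Target side: recompute the expected response and compare safely."""
    secret = VOLUME_SECRETS.get(volume)
    if secret is None:
        return False
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)


challenge = os.urandom(16)  # the target sends a fresh random challenge
good = chap_response(1, b"secret-of-user1", challenge)
bad = chap_response(1, b"secret-of-user2", challenge)
print(target_accepts("Vol-13", 1, challenge, good))
print(target_accepts("Vol-13", 1, challenge, bad))
```

A machine that can reach the iSCSI network but does not hold the volume's secret cannot produce a valid response, so discovery alone does not give access to another user's volume.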


6. VNC Access (β version)

o Provide console access by Serial over LAN (SOL)
o Use Ajax Console (shellinabox): http://code.google.com/p/shellinabox/

[Figure: Nova-Compute reaches the serial console of the bare-metal machine over SOL.]

Bare-Metal Provisioning

1. Instance Request
   - Create a new instance type with "extra_specs = bare-metal"
2. Choose Nova-Compute
   - Create a new scheduler called "Heterogeneous Scheduler"
3. Image Provisioning
   - Use Intel vPro and IPMI to turn bare-metal machines on/off
4. Network Isolation
   - Use Quantum (OpenFlow) to protect against address spoofing and create a private LAN within a security group
5. Nova-Volume Attachment
   - Network ACL (VLAN and CHAP)
6. VNC Access
   - Serial over LAN
7. Snapshot
   - TBD

Libvirt and Bare-Metal Driver

o Compare operations supported by Horizon

Category  Operation        Libvirt  Bare-Metal
Instance  Activate         O        O (IPMI)
          Reboot           O        O (IPMI)
          Suspend          O        X
          Terminate        O        O (IPMI)
          MAC/IP Address   O        O (Deploy Ramdisk)
          Floating IP      O        O
          Snapshot         O        X
Security  Security Groups  O        O (OpenFlow)
          Keypair          O        O
Console                    O (VNC)  △ (SOL)


Demonstration Mana Kaneko

(NTT docomo Inc.)


Implemented Functions


Scaling the Nova-Compute using Zabbix

General Bare-Metal

Provisioning Framework


Bare-Metal Machine Provisioning

o Manage bare-metal machines the same way as virtual machines
  - Run an instance through the OpenStack API
    - euca-run-instances -t b1.tiny --ramdisk ari-bare (--kernel aki-bare) ami-bare
o Utilize all the ecosystem created on top of OpenStack
  - Auto-Scaling

[Figure: OpenStack manages both virtual machines and bare-metal machines.]

Auto-Scaling of the Nova-Compute

o Change resources dynamically based on load

[Figure: common computing pool.]

How Does Zabbix Scale a Nova-Compute?

o ITEM (sent from Nova-Compute to Zabbix)
  - "Item1" = Total CPUs
    - Host system information (total CPUs, total memory, total disk, etc.) from the Zabbix agent
  - "Item2" = Total vCPUs
    - VM management information (VM's CPU load, total vCPUs, VM's memory, VM's disk, etc.) from Libvirt, via the Collectd Libvirt plugin and the Zabbix plugin
o TRIGGER
  - Scale-out trigger, scale-in trigger
o ACTION
  - Scale-out action, scale-in action

Trigger & Action for scaling the Nova-Compute

Item List

Item   Value
Item1  Total CPUs
Item2  Total vCPUs

Trigger List

Trigger    Expression                                                      Value
Scale-out  Total vCPUs.ave(60) > Total CPUs                                True: PROBLEM, False: OK
Scale-in   Total vCPUs.ave(180) < Total CPUs - number of CPUs per server   True: PROBLEM, False: OK

Action List

Action     Value Status  Operation
Scale-out  PROBLEM       Execute "euca-run-instances~" command to Nova-api
Scale-in   PROBLEM       Execute "euca-terminate-instances~" command to Nova-api
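The two trigger expressions can be sketched as a plain function. This assumes that the arguments 60 and 180 are averaging windows and that the samples passed in already cover those windows; the function name and return values are made up for this example, and the real deployment evaluates these expressions inside Zabbix.

```python
from statistics import mean


def scale_decision(vcpus_last_60, vcpus_last_180, total_cpus, cpus_per_server):
    """Return 'scale-out', 'scale-in', or 'none' from Total vCPUs samples."""
    # Scale-out trigger: average vCPUs in use exceed the physical CPUs.
    if mean(vcpus_last_60) > total_cpus:
        return "scale-out"
    # Scale-in trigger: the load would still fit after removing one server.
    if mean(vcpus_last_180) < total_cpus - cpus_per_server:
        return "scale-in"
    return "none"


# Two servers with 8 CPUs each (16 CPUs in total).
print(scale_decision([20, 22, 24], [18, 20, 22], 16, 8))
print(scale_decision([4, 4], [4, 5, 6], 16, 8))
print(scale_decision([10], [10], 16, 8))
```

Using a longer window for scale-in than for scale-out, as the trigger list does, makes the system quicker to add a Nova-Compute than to remove one.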


Scaling Nova-Compute


Bare-metal codes for submission

o Updated scheduler and compute for multiple bare-metal capabilities
  - https://review.openstack.org/13920
o Added separate bare-metal MySQL DB
  - https://review.openstack.org/10726
o A script for bare-metal node management
  - https://review.openstack.org/#/c/11366/
o Updated bare-metal provisioning framework
  - https://review.openstack.org/11354
o Added PXE back-end bare-metal
  - https://review.openstack.org/11088
o Added bare-metal host manager
  - https://review.openstack.org/11357

Bare-metal docs

OpenStack Wiki
•  http://wiki.openstack.org/GeneralBareMetalProvisioningFramework

OpenStack Source
•  nova/virt/baremetal/docs/*.rst
•  README and installation documents

The Latest Github branch
•  https://github.com/NTTdocomo-openstack/nova/

o Contact: USC/ISI & NTT Docomo
o Interested companies: collaboration / testing

Design & Implementation

Bare-metal provisioning interests

summit session

meetup

Tuesday @4:30-5:10pm

[Emma AB]
