
OpenStack DC Meet Up

August 9th at 6:30pm @warehousedc

www.meetup.com/OpenStackDC

www.twitter.com/OpenStackDC

THANK YOU TO OUR SPONSOR,

Meet our OpenStack DC Organizers

Haisam Ido

Kapil Thangavelu

Matthew Metheny

Eric Mandel

Jason Ford

Kenna McCabe

Ryan Day

PRESENTATIONS

"Ansible, Vagrant and OpenStack on your laptop”

by Lorin Hochstein (@lhochstein)

"High-Performance, Heterogeneous Computing and OpenStack"

by David Kang, Cloud Computing and HPC Engineer at University of Southern California / Information Sciences Institute

Vagrant, Ansible and OpenStack on your laptop

Lorin Hochstein
Nimbis Services
Email: [email protected]
Twitter: @lhochstein

Setting up OpenStack for production is complex and error-prone

2012-08-04 12:31:56 INFO nova.rpc.common [-] Reconnecting to AMQP server on localhost:5672
2012-08-04 12:31:56 ERROR nova.rpc.common [-] AMQP server on localhost:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 30 seconds.
2012-08-04 12:31:56 TRACE nova.rpc.common Traceback (most recent call last):
2012-08-04 12:31:56 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 446, in reconnect
2012-08-04 12:31:56 TRACE nova.rpc.common     self._connect()
2012-08-04 12:31:56 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 423, in _connect
2012-08-04 12:31:56 TRACE nova.rpc.common     self.connection.connect()
2012-08-04 12:31:56 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 154, in connect
2012-08-04 12:31:56 TRACE nova.rpc.common     return self.connection
2012-08-04 12:31:56 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 560, in connection
2012-08-04 12:31:56 TRACE nova.rpc.common     self._connection = self._establish_connection()
2012-08-04 12:31:56 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 521, in _establish_connection
2012-08-04 12:31:56 TRACE nova.rpc.common     conn = self.transport.establish_connection()
2012-08-04 12:31:56 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 255, in establish_connection
2012-08-04 12:31:56 TRACE nova.rpc.common     connect_timeout=conninfo.connect_timeout)
2012-08-04 12:31:56 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 52, in __init__
2012-08-04 12:31:56 TRACE nova.rpc.common     super(Connection, self).__init__(*args, **kwargs)
2012-08-04 12:31:56 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/connection.py", line 129, in __init__
2012-08-04 12:31:56 TRACE nova.rpc.common     self.transport = create_transport(host, connect_timeout, ssl)
2012-08-04 12:31:56 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/transport.py", line 281, in create_transport
2012-08-04 12:31:56 TRACE nova.rpc.common     return TCPTransport(host, connect_timeout)
2012-08-04 12:31:56 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/transport.py", line 85, in __init__
2012-08-04 12:31:56 TRACE nova.rpc.common     raise socket.error, msg
2012-08-04 12:31:56 TRACE nova.rpc.common error: [Errno 111] ECONNREFUSED

You're looking for better ways to do deployment

Shell scripts are painful; Puppet & Chef have steep learning curves

if [[ $EUID -eq 0 ]]; then
    ROOTSLEEP=${ROOTSLEEP:-10}
    echo "You are running this script as root."
    echo "In $ROOTSLEEP seconds, we will create a user 'stack' and run as that user"
    sleep $ROOTSLEEP

# since this script runs as a normal user, we need to give that user
# ability to run sudo
if [[ "$os_PACKAGE" = "deb" ]]; then
    dpkg -l sudo || apt_get update && install_package sudo
else
    rpm -qa | grep sudo || install_package sudo
fi

if ! getent passwd stack >/dev/null; then
    echo "Creating a user called stack"
    useradd -U -s /bin/bash -d $DEST -m stack
fi

Source: devstack/stack.sh

You want an easy way to write & debug deployment scripts

Use Ansible to write OpenStack deployment scripts, and Vagrant to test them inside VMs

Ansible big idea: very simple syntax, SSH for communication
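Because everything goes over plain SSH, there is no agent to install on the managed hosts. A quick connectivity smoke test (a minimal sketch, assuming the inventory file shown later) is the ping module:

$ ansible all -m ping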

Example Ansible play: install ntp

---
- hosts: controller
  tasks:
  - name: ensure ntp package is installed
    action: apt pkg=ntp

  - name: ensure ntp.conf file is present
    action: copy src=files/ntp.conf dest=/etc/ntp.conf
            owner=root group=root mode=0644

  - name: ensure ntp service is restarted
    action: service name=ntp state=restarted

Specify hosts in an inventory file

[controller]

192.168.206.130

[compute]

192.168.206.131

192.168.206.132

192.168.206.133

192.168.206.134

Run the playbook

$ ansible-playbook ntp.yaml

PLAY [controller] *********************

GATHERING FACTS *********************

ok: [192.168.206.130]

TASK: [ensure ntp package is installed] *********************

ok: [192.168.206.130]

TASK: [ensure ntp.conf file is present] *********************

ok: [192.168.206.130]

TASK: [ensure ntp service is restarted] *********************

ok: [192.168.206.130]

PLAY RECAP *********************

192.168.206.130 : ok=4 changed=3 unreachable=0 failed=0

What did Ansible just do?

1. Made SSH connections to remote host

2. Copied over Python modules and arguments parsed from playbook file

3. Executed modules on remote machine

Can run a single action using the ansible command

$ ansible controller -m apt -a "pkg=ntp"

192.168.206.130 | success >> {

"changed": false,

"item": "",

"module": "apt"

}

Ansible scripts are idempotent: can run multiple times safely

$ ansible-playbook ntp.yaml

PLAY [controller] *********************

GATHERING FACTS *********************

ok: [192.168.206.130]

TASK: [ensure ntp package is installed] *********************

ok: [192.168.206.130]

TASK: [ensure ntp.conf file is present] *********************

ok: [192.168.206.130]

TASK: [ensure ntp service is restarted] *********************

ok: [192.168.206.130]

PLAY RECAP *********************

192.168.206.130 : ok=4 changed=1 unreachable=0 failed=0

Use handlers if an action should only occur on a state change

---
- hosts: controller
  tasks:
  - name: ensure glance database is present
    action: mysql_db name=glance
    notify:
    - version glance database

  handlers:
  - name: version glance database
    action: command glance-manage version_control 0

Use templates to substitute variables in config file

keystone.conf:

[DEFAULT]
public_port = 5000
admin_port = 35357
admin_token = {{ admin_token }}

keystone.yaml:

---
- hosts: controller
  vars:
    admin_token: 012345SECRET99TOKEN012345
  tasks:
  - name: ensure keystone config script is present
    action: template src=keystone.conf dest=/etc/keystone/keystone.conf
            owner=root group=root mode=0644

Ansible supports many modules, and can also run arbitrary shell commands (examples after this list)

• apt & yum packages

• Stop/start/restart services

• users & groups

• Add SSH public keys

• MySQL & PostgreSQL users & databases

• VMs managed by libvirt

• Git checkouts
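The same modules can be driven ad hoc from the command line. A few illustrative sketches (the group names and arguments here are hypothetical, not from the openstack-ansible playbooks):

$ ansible compute -m apt -a "pkg=qemu-kvm"

$ ansible controller -m service -a "name=mysql state=restarted"

$ ansible all -m shell -a "uptime"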

Vagrant big idea: redistributable VMs, run with config files & commands

Import a new virtual machine (Ubuntu 12.04 64-bit)

$ vagrant box add precise64 http://files.vagrantup.com/precise64.box

Make a Vagrantfile

Vagrant::Config.run do |config|
  config.vm.box = "precise64"
end

Vagrant can also generate this for you: "vagrant init precise64"

Boot it and connect to it

$ vagrant up

[default] Importing base box 'precise64'...

[default] Matching MAC address for NAT networking...

[default] Clearing any previously set forwarded ports...

[default] Fixed port collision for 22 => 2222. Now on port 2200.

[default] Forwarding ports...

[default] -- 22 => 2200 (adapter 1)

[default] Creating shared folders metadata...

[default] Clearing any previously set network interfaces...

[default] Booting VM...

[default] Waiting for VM to boot. This can take a few minutes.

[default] VM booted and ready for use!

[default] Mounting shared folders...

[default] -- v-root: /vagrant

$ vagrant ssh

Welcome to Ubuntu 12.04 LTS (GNU/Linux 3.2.0-23-generic x86_64)

* Documentation: https://help.ubuntu.com/

Welcome to your Vagrant-built virtual machine.

Last login: Thu Jun 7 00:49:30 2012 from 10.0.2.2

vagrant@precise64:~$
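The rest of the VM lifecycle is just as simple; as a quick reference, the standard subcommands are:

$ vagrant status     # show VM state
$ vagrant halt       # shut the VM down
$ vagrant destroy    # delete the VM entirely
$ vagrant reload     # restart the VM, re-reading the Vagrantfile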

Boot multiple VMs: configure IPs, memory, hostnames

Vagrant::Config.run do |config|
  config.vm.box = "precise64"

  config.vm.define :controller do |controller_config|
    controller_config.vm.network :hostonly, "192.168.206.130"
    controller_config.vm.host_name = "controller"
  end

  config.vm.define :compute1 do |compute1_config|
    compute1_config.vm.network :hostonly, "192.168.206.131"
    compute1_config.vm.host_name = "compute1"
    compute1_config.vm.customize ["modifyvm", :id, "--memory", 1024]
  end
end
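With multiple VMs defined, the same commands take an optional VM name matching the :controller and :compute1 definitions above:

$ vagrant up                 # boot every VM in the file
$ vagrant up controller      # boot just the controller
$ vagrant ssh compute1       # SSH into a specific VM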

Openstack-ansible: Ansible scripts for OpenStack Compute

Links to OpenStack Install & Deploy Guide

Config: controller, one compute host, QEMU, FlatDHCP

[Network diagram: two VMs, controller (.130) and compute1 (.131), each with eth0 NATed and eth1/eth2 on the 192.168.206.* and 192.168.100.* host-only networks.]

Vagrantfile describes this setup:

Vagrant::Config.run do |config|
  config.vm.box = "precise64"

  config.vm.define :controller do |controller_config|
    controller_config.vm.network :hostonly, "192.168.206.130"
    controller_config.vm.host_name = "controller"
  end

  config.vm.define :compute1 do |compute1_config|
    compute1_config.vm.network :hostonly, "192.168.206.131"
    compute1_config.vm.host_name = "compute1"
    compute1_config.vm.customize ["modifyvm", :id, "--memory", 1024]
    compute1_config.vm.customize ["modifyvm", :id, "--nicpromisc3", "allow-all"]
  end
end

If all goes well…

$ make all

. . .

+-------------------------------------+--------------------------------------+

| Property | Value |

+-------------------------------------+--------------------------------------+

| OS-DCF:diskConfig | MANUAL |

| OS-EXT-SRV-ATTR:host | None |

| OS-EXT-SRV-ATTR:hypervisor_hostname | None |

| OS-EXT-SRV-ATTR:instance_name | instance-00000001 |

| OS-EXT-STS:power_state | 0 |

| OS-EXT-STS:task_state | scheduling |

| OS-EXT-STS:vm_state | building |

| accessIPv4 | |

| accessIPv6 | |

| adminPass | CJ8NNNa4dc6f |

| config_drive | |

| created | 2012-08-09T02:51:14Z |

| flavor | m1.tiny |

| hostId | |

| id | 8e9238b8-208d-46a8-8f66-c40660abacff |

| image | cirros-0.3.0-x86_64 |

| key_name | mykey |

| metadata | {} |

| name | cirros |

| progress | 0 |

| status | BUILD |

| tenant_id | 6f29ce771aba46f29f53e178e3b02e66 |

| updated | 2012-08-09T02:51:14Z |

| user_id | ad809727c0a748c9ad12834b6f24b3a1 |

+-------------------------------------+--------------------------------------+
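From here you can watch the instance go ACTIVE with the usual novaclient commands (assuming the credential environment variables are already set; "cirros" is the instance name from the output above):

$ nova list
$ nova show cirros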

Links

• Vagrantfile & Ansible playbooks for OpenStack:

http://github.com/lorin/openstack-ansible

• Ansible: http://ansible.github.com

• Vagrant: http://vagrantup.com

• Ansible playbook examples: https://github.com/ansible/ansible/tree/devel/examples/playbooks

• Vagrant boxes: http://vagrantbox.es

Heterogeneous, High-Performance Cloud Computing using OpenStack

Dong-In Kang, Steve Crago, John P. Walters, Mikyung Kang, Jinwoo Suh, Jeff Burney, and Karandeep Singh

University of Southern California / Information Sciences Institute

August 9, 2012

Objectives

Heterogeneous, virtualized high performance computing (HPC) testbed

HPC resources available through private cloud
— Resources available remotely for operations, prototypes, experiments and disadvantaged users
— Dynamic resource provisioning
— Non-proprietary open source cloud software that can be replicated and extended as needed

Heterogeneous processing resources
— Large x86-based shared memory machine (SGI UV100)
— General-purpose many-core (Tilera TILEmpower)
— GPU-based accelerators (NVidia Tesla)
— Architectures other than x86 (ARM, …)

Heterogeneous Processing Testbed

Heterogeneous On-Demand Processing Testbed

[Diagram: HPC cluster combining a shared-memory node ((1) SGI UV100), tiled processors ((10) Tilera TILEmpower), a GPU cluster ((3) Tesla S2050), and a commodity cluster with a storage array.]

• 1 SGI Altix UV 100 (Intel Xeon Nehalem, 128 cores)
• 10 TILEmpower boards (Tilera TILEPro64, 640 cores)
• 3 Tesla 2050s (NVidia Fermi GPUs, 5,376 cores)
• Commodity cluster (Intel Xeon Clovertown, 80 cores)

Heterogeneous Processors


Processing Component: Characteristics

SGI UV 100: Shared memory, traditional HPC, x86 processors that support legacy code. Supports KVM and LXC.

Tilera TILEmpower: General-purpose many-core, 10x-100x improvement in power efficiency for integer processing, Linux-based C/C++ development environment. Supports bare-metal provisioning.

Nvidia TESLA 2050: Very high performance and efficiency (100x) for regular computational kernels, CUDA development environment. Supports LXC (host).

Heterogeneity: Architectures

[Figures: CPU vs. GPU sampling comparison (10^8 vs. 10^10 samples; 136.2 seconds vs. 139.5 seconds); SGI UV100 rendering 1926 objects; Tilera vs. x86 video transcoding.]

Using Heterogeneous Architecture in OpenStack

New machine types
— Each machine type requires (or can handle) a unique image type (e.g. a GPU requires a GPU executable)
— Each machine type has its own image boot process

Machine types:
• SGI Ultra Violet: sh1.small, sh1.large, …
• Tilera TILEmpower: tp64.8x8, …
• Nvidia Tesla GPU: cg1.large+s2050
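Booting onto one of these machine types is then an ordinary request against the matching flavor (an illustrative sketch; the image name and instance name here are hypothetical):

$ nova boot --flavor cg1.large+s2050 --image gpu-cuda-image gpu-test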

Heterogeneity: Virtualization

3D parallel rendering system

— Tachyon v. 0.99

— Rendering a scene with 1926 objects

— Shared memory test

[Chart: Speedup of 3D Rendering (Tachyon) vs. number of H/W threads used (1, 16, 32, 64), comparing Native (w/o pinning), KVM w/ pinning, LXC w/ pinning (2x h/w threads), LXC w/ pinning, and LXC w/o pinning.]

Heterogeneity: GPU Access Methods

[Chart: Host to Device Bandwidth, Pageable — MB/sec vs. transfer size in bytes, for Host, LXC, and gVirtus.]

[Chart: Matrix Multiply for Increasing NxM, Single Precision Real — GFlops/sec for sizes 80x160 through 800x1600, for Host, gVirtus, and LXC.]

How to Support Heterogeneity in OpenStack

Scheduler
— Uses the 'instance_type_extra_specs' table in the 'nova' DB:

| 15 | cpu_arch        | s== x86_64 |
| 15 | hypervisor_type | s== LXC    |
| 15 | gpu_arch        | s== fermi  |
| 15 | gpus            | = 1        |

— /etc/nova/nova.conf:

instance_type_extra_specs=cpu_arch:x86_64, gpus:4, gpu_arch:fermi

— An instance is scheduled onto a host only if both sides match
— Blueprint (under code review): https://blueprints.launchpad.net/nova/+spec/instance-type-extra-specs-extension
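For reference, the stored specs can be inspected straight out of the nova database (a hypothetical query; the column names are inferred from the rows shown above):

$ mysql -u nova nova -e 'SELECT instance_type_id, `key`, value FROM instance_type_extra_specs WHERE instance_type_id = 15;'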

Baremetal Provisioning
— USC/ISI + NTT Docomo
— Blueprint (under code review): https://blueprints.launchpad.net/nova/+spec/general-bare-metal-provisioning-framework

Future Plans

• Additional devices

• FPGAs

• ARM cores (Calxeda)

• Next-generation GPUs

• New host virtualization options with GPUs

• Collaboration with Nvidia

• Resource scheduling

• Security hardening

• Application demonstrations

• Deployment

THANK YOU FOR COMING!

Please stay tuned for the next Meet Up!

You will receive a survey & your feedback is greatly appreciated!

Follow us on…

http://twitter.com/OpenStackDC

http://meetup.com/OpenStackDC

http://linkedin.com/groups/OpenStack-DC-4207039

http://www.meetup.com/OpenStackDC/suggestion/

http://www.meetup.com/OpenStackDC/messages/boards/