
Base Users Guide
5.4.3 Edition

Published Aug 11 2011
Copyright 2011 University of California

    This document is subject to the Rocks License (see Rocks Copyright).

Table of Contents

Preface
1. Overview
2. Attributes
3. Installing a Rocks Cluster
    3.1. Getting Started
    3.2. Install and Configure Your Frontend
    3.3. Install Your Compute Nodes
    3.4. Upgrade or Reconfigure Your Existing Frontend
    3.5. Installing a Frontend over the Network
    3.6. Enabling Public Web Access to Your Frontend
4. Defining and Modifying Networks and Network Interfaces
    4.1. Networks, Subnets, VLANs and Interfaces
    4.2. Named Networks/Subnets
    4.3. Host Interfaces
    4.4. Virtual LANs (VLANs) and Logical VLAN Bridges
    4.5. Network Bridging for Virtual Machines
    4.6. Networking Configuration Examples
5. Customizing your Rocks Installation
    5.1. Adding Packages to Compute Nodes
    5.2. Customizing Configuration of Compute Nodes
    5.3. Adding Applications to Compute Nodes
    5.4. Configuring Additional Ethernet Interfaces
    5.5. Compute Node Disk Partitioning
    5.6. Creating a Custom Kernel RPM
    5.7. Enabling RSH on Compute Nodes
    5.8. Adding a New Appliance Type to the Cluster
    5.9. Adding a Device Driver
    5.10. Extending DNS
    5.11. Changing the Root password
6. Community Resources
    6.1. Access to Rocks Source Code
    6.2. All Past Rocks Releases
    6.3. Email Discussion List
    6.4. Office Hours
7. Administration Examples
    7.1. Introduction to the Rocks Command Line
    7.2. Boot Order and PXE First
    7.3. Support for PXE First
    7.4. Forcing a Re-install at Next PXE Boot
    7.5. Inspecting and Changing PXE Behaviour
    7.6. Working with and Modifying Network Configuration
    7.7. Reinstall All Compute Nodes with SGE
8. Advanced Tasks
    8.1. Managing the Firewall on the Cluster Nodes
    8.2. Flashing BIOS on Compute Nodes Using PXE
    8.3. Adding a Login Appliance to Your Cluster
    8.4. Channel Bonding Interfaces
    8.5. Frontend Central Server
    8.6. Cross Kickstarting
    8.7. Adding Kernel Boot Parameters
9. Command Reference
    9.1. add
    9.2. config
    9.3. create
    9.4. disable
    9.5. dump
    9.6. enable
    9.7. help
    9.8. iterate
    9.9. list
    9.10. remove
    9.11. report
    9.12. run
    9.13. set
    9.14. swap
    9.15. sync
    9.16. update
A. Frequently Asked Questions
    A.1. Installation
    A.2. Configuration
    A.3. System Administration
B. Release Notes
    B.1. Release 5.4.3 - changes from 5.4
    B.2. Release 5.4 - changes from 5.3
    B.3. Release 5.3 - changes from 5.2
    B.4. Release 5.2 - changes from 5.1
    B.5. Release 5.1 - changes from 5.0
    B.6. Release 4.3 - changes from 4.2.1
    B.7. Release 3.2.0 - changes from 3.1.0
    B.8. Release 3.2.0 - changes from 3.1.0
    B.9. Release 3.1.0 - changes from 3.0.0
    B.10. Release 3.0.0 - changes from 2.3.2
    B.11. Release 2.3.2 - changes from 2.3.1
    B.12. Release 2.3.1 - changes from 2.3
    B.13. Release 2.2.1 - changes from 2.2
    B.14. Release 2.2 - changes from 2.1.2
    B.15. Release 2.1.2 - changes from 2.1.1
    B.16. Release 2.1.1 - changes from 2.1
    B.17. Release 2.1 - changes from 2.0.1
    B.18. Release 2.0.1 - changes from 2.0
C. 411 Secure Information Service Internals
    C.1. Using the 411 Service
    C.2. Structure
    C.3. 411 Groups
    C.4. Plugins
    C.5. 411get Configuration File
    C.6. Commands
D. Changes to Rocks Security Infrastructure
    D.1. Rocks Password Infrastructure
    D.2. Rocks Secure Attribute Infrastructure
E. Kickstart Nodes Reference
    E.1. Rocks Base Nodes
F. Rocks Copyright and Trademark
    F.1. Copyright Statement
    F.2. Trademark Licensing
G. Common Licenses
    G.1. Artistic License
    G.2. Apache v2.0
    G.3. GNU General Public License v1
    G.4. GNU General Public License v2
    G.5. GNU Lesser General Public License v2.1
    G.6. GNU Library General Public License v2
    G.7. Python Software Foundation License v2
H. Package Licenses
    H.1. anaconda
    H.2. ant
    H.3. coreutils
    H.4. cvs
    H.5. eGenix mx
    H.6. FireFox
    H.7. gawk
    H.8. gd
    H.9. graphviz
    H.10. kudzu
    H.11. libxml2
    H.12. libxml2doc
    H.13. mysql
    H.14. ncurses
    H.15. numarray
    H.16. Numeric
    H.17. perl
    H.18. perl tk
    H.19. pexpect
    H.20. phpMyAdmin
    H.21. POW
    H.22. pygtk
    H.23. python
    H.24. rcs
    H.25. readline
    H.26. tidy
    H.27. wget

List of Tables

1-1. Summary
1-2. Compatibility
2-1. Roll Attributes
3-1. Frontend -- Default Root Disk Partition
5-1. Compute Node -- Default Root Disk Partition
5-2. A Compute Node with 3 SCSI Drives

Preface

Since May 2000, the Rocks group has been addressing the difficulties of deploying manageable clusters. We have been driven by one goal: make clusters easy. By easy we mean easy to deploy, manage, upgrade and scale. We are driven by this goal to help deliver the computational power of clusters to a wide range of scientific users. It is clear that making stable and manageable parallel computing platforms available to a wide range of scientists will aid immensely in improving the state of the art in parallel tools.

Chapter 1. Overview

    Table 1-1. Summary

    Name base

    Version 5.4.3

    Maintained By Rocks Group

    Architecture i386, x86_64

    Compatible with Rocks 5.4.3

The base roll has the following requirements of other rolls. Compatibility with all known rolls is assured, and all known conflicts are listed. There is no assurance of compatibility with third-party rolls.

    Table 1-2. Compatibility

Requires: Kernel, OS, Service Pack
Conflicts: (none)

This roll has been released independently of the corresponding Rocks release. It therefore requires the complete OS roll and will not function correctly if using only the Jumbo or an incomplete set of OS CD-ROMs.

Chapter 2. Attributes

    Table 2-1. Roll Attributes

Name            Type   Default
disableServices string kudzu canna cWnn FreeWnn kWnn tWnn mDNSResponder

    Info_CertificateCountry a string

    Info_CertificateLocality a string

    Info_CertificateOrganization a string

    Info_CertificateState a string

    Info_CertificateContact a string

    Info_CertificateLatLong a string

    Info_CertificateName a string

    Info_CertificateURL a string

    Kickstart_DistroDir a string /export/rocks

    Kickstart_Keyboard a string us

    Kickstart_Lang a string en_US

    Kickstart_Langsupport a string en_US

    Kickstart_Mutlicast a string 226.117.172.185

    Kickstart_PrivateAddress a string 10.1.1.1

    Kickstart_PrivateBroadcast a string 10.1.255.255

    Kickstart_PrivateDNSDomain a string local

    Kickstart_PrivateDNSServers a string 10.1.1.1

    Kickstart_PrivateGateway a string 10.1.1.1

    Kickstart_PrivateHostname a string

    Kickstart_PrivateKickstartBaseDir a string install

    Kickstart_PrivateKickstartCGI a string sbin/kickstart.cgi

    Kickstart_PrivateKickstartHost a string 10.1.1.1

    Kickstart_PrivateNTPHost a string 10.1.1.1

    Kickstart_PrivateNetmask a string 255.255.0.0

    Kickstart_PrivateNetmaskCIDR a string 16

    Kickstart_PrivateNetwork a string 10.1.0.0

    Kickstart_PrivatePortableRootPassword a string

    Kickstart_PrivateRootPassword a string

    Kickstart_PrivateSHARootPassword a string

    Kickstart_PrivateSyslogHost a string 10.1.1.1

    Kickstart_PublicAddress a string

    Kickstart_PublicBroadcast a string

Kickstart_PublicDNSDomain a string

    Kickstart_PublicDNSServers a string

    Kickstart_PublicGateway a string

    Kickstart_PublicHostname a string

    Kickstart_PublicKickstartHost a string

    Kickstart_PublicNTPHost a string

    Kickstart_PublicNetmask a string

    Kickstart_PublicNetmaskCIDR a string

    Kickstart_PublicNetwork a string

    Kickstart_Timezone a string

    airboss b string specified on boot line

    arch c, b string i386 | x86_64

    dhcp_filename d string pxelinux.0

    dhcp_nextserver d string 10.1.1.1

    hostname e, b string

    kickstartable d bool TRUE

    os c, b string linux | solaris

    rack e, b int

    rank e, b int

    rocks_version a string 5.4.3

    rsh f bool FALSE

    ssh_use_dns a bool TRUE

    x11 f bool FALSE

Notes:
a. Default value created using rocks add attr <name> <value>; affects all hosts.
b. Default value created using rocks add host attr localhost <name> <value>; only affects the frontend appliance.
c. Attribute is for internal use only, and should not be altered by the user. Each time a machine installs, this attribute is reset to the default value for that machine (depending on the kernel booted).
d. Default value created using rocks add appliance attr <appliance> <name> <value> for the frontend and compute appliances.
e. Attribute cannot be modified. This value is not recorded in the cluster database and is only available as an XML entity during installation.
f. Attribute is referenced but not defined, so it is treated as FALSE.

    Info_Certificate_{*}

The attributes are created during frontend installation. The values are taken from user input on the system installation screens.

    Kickstart_{*}

The attributes are created during frontend installation. The values are taken from user input on the system installation screens. All of these attributes are considered internal to Rocks and should not be modified directly.

    airboss

    Specifies the address of the airboss host. This only applies to virtual machines.

    arch

The CPU architecture of the host. This host-specific attribute is set by the installing machine. User changes to this attribute have no effect.

    dhcp_filename

    Name of the PXE file retrieved over TFTP at startup.

    dhcp_nextserver

IP address of the server that serves installation profiles (kickstart, jumpstart). In almost all configurations this should be the frontend machine.

    kickstartable

The attribute must be set to TRUE for all appliances, and FALSE (or undefined) for all unmanaged devices (e.g., network switches).

    os

The OS of the host. This host-specific attribute is set by the installing machine. User changes to this attribute have no effect.

    rsh

If TRUE the machine is configured as an RSH client. This is not recommended, and will still require RSH server configuration on the frontend machine.

    ssh_use_dns

Set to FALSE to disable DNS lookups when connecting to nodes in the cluster over SSH. If establishing an ssh connection is slow, the cause may be a faulty (or absent) DNS system. Disabling this lookup will speed up connection establishment, but lowers the security of your system.

    x11

If TRUE X11 is configured and the default runlevel is changed from 3 to 5. X11 is always configured on the frontend, and this attribute applies only to the other nodes in the cluster.
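As a worked example of the notes above, the following commands, run as root on the frontend, override attributes from this table. This is a minimal sketch: the commands come from the command reference in Chapter 9, but compute-0-0 is a hypothetical node name and output formats vary by release.

# rocks set attr ssh_use_dns false
# rocks add host attr compute-0-0 x11 true
# rocks list host attr compute-0-0

The first command changes a global default (note a); the second creates a host-specific value on one node (note b). Changes take effect at the next node installation or rocks sync config.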


Chapter 3. Installing a Rocks Cluster

3.1. Getting Started

This chapter describes the steps to build your cluster and install its software.

3.1.1. Supported Hardware

Processors

    x86 (ia32, AMD Athlon, etc.)

    x86_64 (AMD Opteron and EM64T)

    Networks

    Ethernet

Specialized networks and components (e.g., Myrinet, Infiniband, nVidia GPU) are also supported. Hardware requirements and software (Rocks Rolls) can be found on the respective vendor web sites.

3.1.2. Minimum Hardware Requirements

Frontend Node

    Disk Capacity: 30 GB

    Memory Capacity: 1 GB

    Ethernet: 2 physical ports (e.g., "eth0" and "eth1")

    BIOS Boot Order: CD, Hard Disk

    Compute Node

    Disk Capacity: 30 GB

    Memory Capacity: 1 GB

    Ethernet: 1 physical port (e.g., "eth0")

    BIOS Boot Order: CD, PXE (Network Boot), Hard Disk


3.1.3. Physical Assembly

The first thing to manage is the physical deployment of a cluster. Much research exists on the topic of how to physically construct a cluster. A majority of the O'Reilly book1 Building Linux Clusters is devoted to the physical setup of a cluster, how to choose a motherboard, etc. Finally, the book How to Build a Beowulf also has some good tips on physical construction.

We favor rack-mounted equipment because of its relative reliability and density. There are Rocks clusters, however, that are built from mini-towers. Choose what makes sense for you.

    The following diagram shows how the frontend and compute nodes must be connected:

On the compute nodes, the Ethernet interface that Linux maps to eth0 should be connected to the cluster's Ethernet switch. This network is considered private, that is, all traffic on this network is physically separated from the external public network (e.g., the internet).

On the frontend, at least two ethernet interfaces are required. The interface that Linux maps to eth0 should be connected to the same ethernet network as the compute nodes. The interface that Linux maps to eth1 should be connected to the external network (e.g., the internet or your organization's intranet).

3.2. Install and Configure Your Frontend

This section describes how to install your Rocks cluster frontend.

    The minimum requirement to bring up a frontend is to have the following rolls:

    Kernel/Boot Roll CD

    Base Roll CD

    Web Server Roll CD


    OS Roll CD - Disk 1

    OS Roll CD - Disk 2

    The Core Meta Roll CD can be substituted for the individual Base and Web-Server Rolls.

Additionally, the official Red Hat Enterprise Linux 5 update 4 CDs can be substituted for the OS Rolls. Also, any true rebuild of RHEL 5 update 4 can be used -- distributions known to work are: CentOS 5 update 4 and Scientific Linux 5 update 4. If you substitute the OS Rolls with one of the above distributions, you must supply all the CDs from the distribution (which usually is 6 or 7 CDs).

    1. Insert the Kernel/Boot Roll CD into your frontend machine and reset the frontend machine.

For the remainder of this section, we'll use the example of installing a bare-bones frontend, that is, we'll be using the Kernel/Boot Roll, Core Roll, OS - Disk 1 Roll and the OS - Disk 2 Roll.

    2. After the frontend boots off the CD, you will see:

    When you see the screen above, type:

    build

The boot: prompt arrives and departs the screen quickly. It is easy to miss. If you do miss it, the node will assume it is a compute appliance, the frontend installation will fail, and you will have to restart the installation (by rebooting the node).


If the installation fails, very often you will see a screen that complains of a missing /tmp/ks.cfg kickstart file. To get more information about the failure, access the kickstart and system log by pressing Ctrl-Alt-F3 and Ctrl-Alt-F4 respectively.

    After you type build, the installer will start running.

    3.

All screens in this step may not appear during your installation. You will only see these screens if there is not a DHCP server on your public network that answers the frontend's DHCP request.

    If you see the screen below:

You'll want to: 1) enable IPv4 support, 2) select manual configuration for the IPv4 support (no DHCP) and, 3) disable IPv6 support. The screen should look like:

After your screen looks like the above, hit "OK". Then you'll see the "Manual TCP/IP Configuration" screen:


In this screen, enter the public IP configuration. Here's an example of the public IP info we entered for one of our frontends:

    After you fill in the public IP info, hit "OK".

4. Soon, you'll see a screen that looks like:

From this screen, you'll select your rolls.

In this procedure, we'll only be using CD media, so we'll only be clicking on the CD/DVD-based Roll button.

    Click the CD/DVD-based Roll button.

    5. The CD will eject and you will see this screen:


Put your first roll in the CD tray (for the first roll, since the Kernel/Boot Roll is already in the tray, simply push the tray back in).

    Click the Continue button.

    6. The Kernel/Boot Roll will be discovered and display the screen:


    Select the Kernel/Boot Roll by checking the Selected box and clicking the Submit button.

    7. This screen shows you have properly selected the Kernel/Boot Roll.

    Repeat steps 3-5 for the Base Roll, Web Server Roll and the OS rolls.

    8. When you have selected all the rolls associated with a bare-bones frontend, the screen should look like:


    When you are done with roll selection, click the Next button.

9. Then you'll see the Cluster Information screen:


    The one important field in this screen is the Fully-Qualified Host Name (all other fields are optional).

Choose your hostname carefully. The hostname is written to dozens of files on both the frontend and compute nodes. If the hostname is changed after the frontend is installed, several cluster services will no longer be able to find the frontend machine. Some of these services include: SGE, NFS, AutoFS, and Apache.

    Fill out the form, then click the Next button.

10. The private cluster network configuration screen allows you to set up the networking parameters for the ethernet network that connects the frontend to the compute nodes.

It is recommended that you accept the defaults (by clicking the Next button). But for those who have unique circumstances that require different values for the internal ethernet connection, we have exposed the network configuration parameters.

11. The public cluster network configuration screen allows you to set up the networking parameters for the ethernet network that connects the frontend to the outside network (e.g., the internet).


    The above window is an example of how we configured the external network on one of our frontend machines.

12. Configure the Gateway and DNS entries:

    13. Input the root password:


    14. Configure the time:

    15. The disk partitioning screen allows you to select automatic or manual partitioning.


To select automatic partitioning, click the Auto Partitioning radio button. This will repartition and reformat the first discovered hard drive that is connected to the frontend. All other drives connected to the frontend will be left untouched.

    The first discovered drive will be partitioned like:

    Table 3-1. Frontend -- Default Root Disk Partition

Partition Name     Size

/ 16 GB

    /var 4 GB

    swap 1 GB

    /export (symbolically linked to /state/partition1) remainder of root disk

When you use automatic partitioning, the installer will repartition and reformat the first hard drive that the installer discovers. All previous data on this drive will be erased. All other drives will be left untouched.

    The drive discovery process uses the output of cat /proc/partitions to get the list of drives.

For example, if the node has an IDE drive (e.g., "hda") and a SCSI drive (e.g., "sda"), generally the IDE drive is the first drive discovered.

But, there are instances when a drive you don't expect is the first discovered drive (we've seen this with certain fibre channel connected drives). If you are unsure how the drives will be discovered in a multi-disk frontend, then use manual partitioning.
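To preview the discovery order yourself, you can inspect /proc/partitions (for example from a rescue shell on the machine before installing). The listing below is an illustrative sketch only -- device names and block counts will differ on your hardware:

# cat /proc/partitions
major minor  #blocks  name
   3     0  78150744  hda
   8     0 143374744  sda

Here hda (IDE) precedes sda (SCSI), so automatic partitioning would erase hda and leave sda untouched.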


16. If you selected manual partitioning, then you will now see Red Hat's manual partitioning screen:

Above is an example of creating /, /var, swap and /export partitions.

If you select manual partitioning, you must specify at least 16 GB for the root partition and you must create a separate /export partition.

    LVM is not supported by Rocks.

    When you finish describing your partitions, click the Next button.

17. The frontend will format its file systems, then it will ask for each of the roll CDs you added at the beginning of the frontend installation.


    In the example screen above, insert the Kernel/Boot Roll into the CD tray and click OK.

The contents of the CD will now be copied to the frontend's hard disk.

    Repeat this step for each roll you supplied in steps 3-5.

    After all the Rolls are copied, no more user interaction is required.

    18. After the last roll CD is copied, the packages will be installed:


19. Finally, the boot loader will be installed and post configuration scripts will be run in the background. When they complete, the frontend will reboot.

    3.3. Install Your Compute Nodes

    1. Login to the frontend node as root.

2. Run the program which captures compute node DHCP requests and puts their information into the Rocks MySQL database:

    # insert-ethers

    This presents a screen that looks like:


If your frontend and compute nodes are connected via a managed ethernet switch, you'll want to select Ethernet Switches from the list above. This is because the default behavior of many managed ethernet switches is to issue DHCP requests in order to receive an IP address that clients can use to configure and monitor the switch.

When insert-ethers captures the DHCP request for the managed switch, it will configure it as an ethernet switch and store that information in the MySQL database on the frontend.

As a side note, you may have to wait several minutes before the ethernet switch broadcasts its DHCP request. If nothing has appeared after 10 minutes (or once insert-ethers has correctly detected and configured the ethernet switch), you should quit insert-ethers by hitting the F8 key.

    Now, restart insert-ethers and continue reading below to configure your compute nodes.

    Take the default selection, Compute, hit Ok.

3. Then you'll see:


    This indicates that insert-ethers is waiting for new compute nodes.

    4. Power up the first compute node.

    The BIOS boot order of your compute nodes should be: CD, PXE (Network Boot), Hard Disk.

If your compute nodes don't support PXE, then you'll need to boot your compute nodes with the Kernel Roll CD.

If you don't have a CD drive in your compute nodes and if the network adapters in your compute nodes don't support PXE, see Using a Floppy to PXE boot.

5. When the frontend machine receives the DHCP request from the compute node, you will see something similar to:

This indicates that insert-ethers received the DHCP request from the compute node, inserted it into the database and updated all configuration files (e.g., /etc/hosts, /etc/dhcpd.conf and DNS).

The above screen will be displayed for a few seconds and then you'll see the following:


In the above image, insert-ethers has discovered a compute node. The "( )" next to compute-0-0 indicates the node has not yet requested a kickstart file. You will see this type of output for each compute node that is successfully identified by insert-ethers.

Figure: The compute node has successfully requested a kickstart file from the frontend. If there are no more compute nodes, you may now quit insert-ethers. Kickstart files are retrieved via HTTPS. If there was an error during the transmission, the error code will be visible instead of "*".

6. At this point, you can monitor the installation by using rocks-console. Just extract the name of the installing compute node from the insert-ethers output (in the example above, the compute node name is compute-0-0), and execute:

    # rocks-console compute-0-0

7. After you've installed all the compute nodes in a cabinet, quit insert-ethers by hitting the F8 key.

8. After you've installed all the compute nodes in the first cabinet and you wish to install the compute nodes in the next cabinet, just start insert-ethers like:


    # insert-ethers --cabinet=1

    This will name all new compute nodes like compute-1-0, compute-1-1, ...
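To double-check what insert-ethers recorded, the listing commands from the command reference (Chapter 9) can be used. A minimal sketch, where compute-1-0 is one of the nodes named above and output columns vary by release:

# rocks list host
# rocks list host interface compute-1-0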

3.4. Upgrade or Reconfigure Your Existing Frontend

This procedure describes how to use a Restore Roll to upgrade or reconfigure your existing Rocks cluster.

Let's create a Restore Roll for your frontend. This roll will contain site-specific info that will be used to quickly reconfigure your frontend (see the section below for details).

# cd /export/site-roll/rocks/src/roll/restore
# make roll

The above command will output a roll ISO image that has a name of the form: hostname-restore-date-0.arch.disk1.iso. For example, on the i386-based frontend with the FQDN of rocks-45.sdsc.edu, the roll will be named like:

    rocks-45.sdsc.edu-restore-2006.07.24-0.i386.disk1.iso

    Burn your restore roll ISO image to a CD.

Reinstall the frontend by putting the Rocks Boot CD in the CD tray (generally, this is the Kernel/Boot Roll) and rebooting the frontend.

    At the boot: prompt type:

    build

At this point, the installation follows the same steps as a normal frontend installation (see the section: Install Frontend) -- with two exceptions:

1. On the first user-input screen (the screen that asks for local and network rolls), be sure to supply the Restore Roll that you just created.

2. You will be forced to manually partition your frontend's root disk.

    You must reformat your / partition, your /var partition and your /boot partition (if it exists).

Also, be sure to assign the mountpoint of /export to the partition that contains the users' home areas. Do NOT erase or format this partition, or you will lose the user home directories. Generally, this is the largest partition on the first disk.

After your frontend completes its installation, the last step is to force a re-installation of all of your compute nodes. The following will force a PXE (network install) reboot of all your compute nodes.

# ssh-agent $SHELL
# ssh-add
# rocks run host compute /boot/kickstart/cluster-kickstart-pxe


3.4.1. Restore Roll Internals

By default, the Restore Roll contains two sets of files: system files and user files, and some user scripts. The system files are listed in the FILES directive in the file: /export/site-roll/rocks/src/roll/restore/src/system-files/version.mk.

FILES = /etc/passwd /etc/shadow /etc/gshadow /etc/group \
        /etc/exports /etc/auto.home /etc/motd

The user files are listed in the FILES directive in the file: /export/site-roll/rocks/src/roll/restore/version.mk.

    FILES += /etc/X11/xorg.conf

If you have other files you'd like saved and restored, then append them to the FILES directive in the file /export/site-roll/rocks/src/roll/restore/version.mk, then rebuild the restore roll, as sketched below.
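For example, a sketch of that workflow -- the extra path /etc/ssh/sshd_config is a hypothetical addition, not a default. Edit /export/site-roll/rocks/src/roll/restore/version.mk so the directive reads:

FILES += /etc/X11/xorg.conf /etc/ssh/sshd_config

Then rebuild the roll:

# cd /export/site-roll/rocks/src/roll/restore
# make roll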

If you'd like to add your own post sections, you can add the name of the script to the SCRIPTS directive of the /export/site-roll/rocks/src/roll/restore/version.mk file.

    SCRIPTS += /share/apps/myscript.sh /share/apps/myscript2.py

This will add the shell script /share/apps/myscript.sh, and the python script /share/apps/myscript2.py in the post section of the restore-user-files.xml file.

If you'd like to run the script in "nochroot" mode, add

    # nochroot

    as the first comment in your script file after the interpreter line, if one is present.

    For example

#!/bin/bash
#nochroot
echo "This is myscript.sh"

    or

#nochroot
echo "This is myscript.sh"

will run the above code in "nochroot" mode during installation. In contrast,

    echo "This is myscript.sh"#nochroot

    or

#!/bin/bash
echo "This is myscript.sh"

    will NOT run the script under "nochroot" mode.


All the files under /export/rocks/install/site-profiles are saved and restored. So, any user modifications that are added via the XML node method will be preserved.

The networking info for all node interfaces (e.g., the frontend, compute nodes, NAS appliances, etc.) is saved and restored. This is accomplished via the rocks dump command.
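If you want to see what is captured, rocks dump (section 9.5) prints the database contents as a series of rocks commands that can re-create the configuration. A minimal sketch, run as root on the frontend (the output file name is just an example):

# rocks dump > /root/cluster-dump.txt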

3.5. Installing a Frontend over the Network

This section describes installing a Rocks frontend from a "Central" server over the wide area network, a process called WAN kickstart. The client frontend will retrieve Rocks Rolls and configuration over the Internet, and use these to install itself.

1. First, boot the node that will be your new frontend with the Kernel/Boot Roll CD (see steps 1 and 2 in the section "Install Frontend").

2. Then you'll see the screen as described in step 3 in the section "Install Frontend". Enter the FQDN of your central server in the Hostname of Roll Server text box (don't change this value if you want to use the default central server), then click the Download button.

You'll see a screen that lists all the rolls available on the central server. Here's an example:

3. Now, select the rolls from the central server. To select a roll, click the checkbox next to the roll. For example, this screen shows the area51, base, bio and viz rolls selected:


    Click the Submit button to continue.

4. Now you'll see a screen similar to the screen below. This screen indicates that the area51, base, bio and viz rolls have been selected.

    5. To select more rolls from another server, go to step 1 and enter a different FQDN.

6. If you'd like to include CD-based rolls with your Network-based rolls, click the CD/DVD-based Roll button and follow the instructions in the section "Install Frontend" starting at step 4.

7. When you are finished installing CD-based rolls, you will enter into the familiar Rocks installation windows. These may change depending on what rolls you have selected. Again the section "Install Frontend" has details for this process.

8. The installer will then retrieve the chosen rolls, rebuild the distribution with all rolls included, then install the packages. Finally, the installer will proceed with the post-section and other elements of a standard frontend install.

    Your frontend should now be installed and ready to initialize compute nodes (see section Install Compute Nodes).


3.6. Enabling Public Web Access to Your Frontend

To permanently enable selected web access to the cluster from other machines on the public network, follow the steps below. Apache's access control directives will provide protection for the most sensitive parts of the cluster web site, however some effort will be necessary to make effective use of them.

HTTP (web access protocol) is a clear-text channel into your cluster. Although the Apache webserver is mature and well tested, security holes in the PHP engine have been found and exploited. Opening web access to the outside world by following the instructions below will make your cluster more prone to malicious attacks and break-ins.

To open port 80 (the www service) for the public network of the frontend, execute:

    # rocks open host firewall localhost network=public protocol=tcp service=www

Then we can see what the resulting firewall rules will look like:

    # rocks report host firewall localhost

*nat
-A POSTROUTING -o eth1 -j MASQUERADE
COMMIT

*filter
:INPUT ACCEPT [0:0]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -i lo -j ACCEPT
-A FORWARD -i eth0 -j ACCEPT
-A FORWARD -i eth1 -o eth0 -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -i eth0 -j ACCEPT
-A INPUT -i eth1 -m state --state RELATED,ESTABLISHED -j ACCEPT
-A INPUT -i eth1 -p tcp --dport https -m state --state NEW --source &Kickstart_PublicNetwork;/&Kickstart_PublicNetmask; -j ACCEPT
-A INPUT -i eth1 -p tcp --dport ssh -m state --state NEW -j ACCEPT
-A INPUT -i eth1 -p tcp --dport www -m state --state NEW --source &Kickstart_PublicNetwork;/&Kickstart_PublicNetmask; -j ACCEPT
-A INPUT -i eth1 -p tcp --dport www -j ACCEPT
# block mysql traffic from non-private interfaces
-A INPUT -p tcp --dport 3306 -j REJECT
# block foundation mysql traffic from non-private interfaces
-A INPUT -p tcp --dport 40000 -j REJECT
# block ganglia traffic from non-private interfaces
-A INPUT -p udp --dport 8649 -j REJECT
-A INPUT -p tcp --dport 0:1024 -j REJECT
-A INPUT -p udp --dport 0:1024 -j REJECT
COMMIT

    In the above example, eth0 is associated with the private network and eth1 is associated with the public network.


Notice the line: "-A INPUT -i eth1 -p tcp --dport www -j ACCEPT". This is the line in the firewall configuration that will allow web traffic from any source to flow in and out of the frontend. This line was added to your firewall configuration with the "rocks open host firewall" command that you executed.

Also, notice the line: "-A INPUT -i eth1 -p tcp --dport www -m state --state NEW --source &Kickstart_PublicNetwork;/&Kickstart_PublicNetmask; -j ACCEPT". This default Rocks firewall rule allows web traffic from your local public subnet to flow in and out of the frontend.

    Now apply the configuration to the host:

    # rocks sync host firewall localhost

    The host will now accept web traffic on its public interface.

Test your changes by pointing a web browser to http://my.cluster.org/, where "my.cluster.org" is the DNS name of your frontend machine.

If you cannot connect to this address, the problem is most likely in your network connectivity between your web browser and the cluster. Check that you can ping the frontend machine from the machine running the web browser, that you can ssh into it, etc.
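A command-line check from a machine outside the cluster can stand in for the browser test. This is a sketch: my.cluster.org is the placeholder hostname from above, and the response headers will vary with your Apache configuration:

$ curl -I http://my.cluster.org/
HTTP/1.1 200 OK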

Notes

1. http://www.oreilly.com

Chapter 4. Defining and Modifying Networks and Network Interfaces

4.1. Networks, Subnets, VLANs and Interfaces

Rocks uses a SQL database to hold information about nodes including network device information. In version 5.1 support was added for VLAN tagged interfaces to enable construction of virtual clusters and other more complicated network scenarios. There are a large number of commands that allow manipulation of subnet definitions, physical interfaces, and logical VLAN interfaces.

The basic model of operation is for an administrator to use a series of commands to add and set/change networking definitions in the database and then either re-install a node or reconfigure/restart the network configuration by calling rocks sync config.

4.2. Named Networks/Subnets

Rocks clusters are required to have two subnets defined: "public" and "private", but a cluster owner can define more subnets. The command rocks list network lists the defined networks:

[root@rocks ~]# rocks list network
NETWORK    SUBNET       NETMASK
private:   172.16.254.0 255.255.255.0
public:    132.239.8.0  255.255.255.0
optiputer: 67.58.32.0   255.255.224.0

In the screen above, the additional network called "optiputer" is defined with netmask 255.255.224.0 (/19). To add a network called "fast" as 192.168.1.0 with netmask 255.255.255.0 (/24), do the following:

[root@rocks ~]# rocks add network fast subnet=192.168.1.0 netmask=255.255.255.0
[root@rocks ~]# rocks list network
NETWORK    SUBNET       NETMASK
private:   172.16.254.0 255.255.255.0
public:    132.239.8.0  255.255.255.0
optiputer: 67.58.32.0   255.255.224.0
fast:      192.168.1.0  255.255.255.0

The subnet and netmask of an existing network can be changed using the rocks set network subnet and rocks set network netmask commands, as illustrated below.
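For instance, to renumber the "fast" network defined above -- a sketch that assumes the positional argument form used by the other set commands in this chapter; the values are examples only:

[root@rocks ~]# rocks set network subnet fast 192.168.2.0
[root@rocks ~]# rocks set network netmask fast 255.255.255.128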


4.3. Host Interfaces

There are three types of interfaces that a cluster owner may need to be concerned about: physical, logical, and VLAN (virtual LAN) bridges. Linux (and other OSes like Solaris) supports logical interfaces that share a particular physical network port. The following shows physical network devices and associations of those devices to a named network (or subnet, used interchangeably in this discussion). In the figures below, the /N notation is a standard method of specifying the number of bits in the netmask. Examples include: /24=255.255.255.0 (Class C subnet), /16=255.255.0.0 (Class B subnet), /8=255.0.0.0 (Class A subnet) and /25=255.255.255.128.

FIGURE: Hosts can have any number of physical networking devices. Every Rocks node must have a private network defined (e.g., eth0). Frontends also must have a separate public network (e.g., eth1). Other devices could be myri0 (for Myrinet) or ib0 (for Infiniband).

Adding a new network interface to a host can be done from the command line. For example, to add an interface named "myri0" with IP address 192.168.1.10 on the logical subnet "fast":

[root@rocks ~]# rocks add host interface compute-0-0-1 iface=myri0 subnet=fast ip=192.168.1.10
[root@rocks ~]# rocks list host interface compute-0-0-1
SUBNET  IFACE MAC               IP             NETMASK       MODULE NAME          VLAN
private eth0  00:16:3e:00:00:11 172.16.254.192 255.255.255.0 xennet compute-0-0-1 ------
fast    myri0 ----------------- 192.168.1.10   255.255.255.0 ------ ------------- ------

You can also set other fields for a host interface (if the field is one of [mac, ip, module, name, vlan]) with the command rocks set host interface <field> <host> iface=<iface> <value>. To set the name associated with the myri0 interface to compute-myri-0-0-1 on the node compute-0-0-1, execute:

[root@rocks ~]# rocks set host interface name compute-0-0-1 iface=myri0 compute-myri-0-0-1
[root@rocks ~]# rocks list host interface compute-0-0-1
SUBNET  IFACE MAC               IP             NETMASK       MODULE NAME               VLAN
private eth0  00:16:3e:00:00:11 172.16.254.192 255.255.255.0 xennet compute-0-0-1      ------
fast    myri0 ----------------- 192.168.1.10   255.255.255.0 ------ compute-myri-0-0-1 ------

4.4. Virtual LANs (VLANs) and Logical VLAN Bridges

Linux supports VLAN tagging on virtual interfaces (i.e., IEEE 802.1Q). For example, if a host has physical interface eth0 (untagged), then the kernel can send and receive tagged packets if a properly defined interface named eth0.<vlan id> has been created and properly configured. Tagging allows the same physical network to be partitioned into many different networks. A key feature of VLAN tagging is that a broadcast packet (e.g., a DHCPDISCOVER packet) only broadcasts on the tagged VLAN in which it was initially sent.

Rocks supports two types of VLAN interfaces. The first is an explicit device name like eth0.10 that is defined on a particular physical interface. The second is a logical device name of the form "vlan*". In Rocks, the physical VLAN device can also have an IP address associated with it; however, a logical VLAN device cannot. We use logical VLANs to construct bridges suitable for virtual clusters.

1. Explicit VLAN Devices of the form <device>.<vlan id> can have IP addresses assigned.

2. Rocks-Specific: Logical VLAN Devices of the form "vlan*" CANNOT have an IP address assigned.

4.4.1. Physical VLAN Devices

Physical VLAN devices are interfaces associated with specific physical interfaces. While eth0 is used as an example, any physical IP interface can have a VLAN associated with it.

FIGURE: Physical VLAN device called eth0.2. This device may be assigned an IP and a network name (e.g., "net") that is unrelated to the network name of the physical device (eth0). All packets sent on this interface will be tagged with VLAN=2. Multiple physical VLAN devices can be defined.

Use the following example to add a physical VLAN device, assign a tag, and assign an IP address:

[root@rocks ~]# rocks add host interface compute-0-0-1 iface=eth0.2 subnet=net2 ip=10.2.1.10
[root@rocks ~]# rocks set host interface vlan compute-0-0-1 iface=eth0.2 vlan=2
[root@rocks ~]# rocks list host interface compute-0-0-1
SUBNET  IFACE  MAC               IP             NETMASK       MODULE NAME          VLAN
private eth0   00:16:3e:00:00:11 172.16.254.192 255.255.255.0 xennet compute-0-0-1 ------
net2    eth0.2 ----------------- 10.2.1.10      255.255.255.0 ------ ------------- 2

4.4.2. Logical VLAN Devices

The second kind of VLAN interface that Rocks supports is what we call a logical VLAN device. A logical VLAN device is a raw interface with no IP address assigned that is generally used as a bridge for virtual machines. A logical VLAN device's subnet assignment does not give it an IP network; instead, it is used to associate the device with a physical interface (described below).

FIGURE: Virtual VLAN devices called vlan2 and vlan3. These types of devices may NOT have an IP address (this is a Rocks-specific construction).

[root@rocks ~]# rocks add host interface compute-0-0-1 vlan2
[root@rocks ~]# rocks add host interface compute-0-0-1 vlan3
[root@rocks ~]# rocks set host interface vlan compute-0-0-1 vlan2 2
[root@rocks ~]# rocks set host interface vlan compute-0-0-1 vlan3 3
[root@rocks ~]# rocks list host interface compute-0-0-1
SUBNET  IFACE MAC               IP             NETMASK       MODULE NAME          VLAN
private eth0  00:16:3e:00:00:11 172.16.254.192 255.255.255.0 xennet compute-0-0-1 ------
------- vlan2 ----------------- -------------- ------------- ------ ------------- 2
------- vlan3 ----------------- -------------- ------------- ------ ------------- 3

At this stage, the vlan interfaces are not yet associated with any physical network device. Linux will not configure these devices on the node without the association. We overload the meaning of subnet in this case to mean: "associate the logical vlan device with the physical device that is in subnet x". As an example, we can associate both vlan2 and vlan3 to be tagged packet interfaces on the subnet named private.

[root@tranquil ~]# rocks set host interface subnet compute-0-0-1 vlan2 subnet=private
[root@tranquil ~]# rocks set host interface subnet compute-0-0-1 vlan3 subnet=private
[root@tranquil ~]# rocks list host interface compute-0-0-1
SUBNET  IFACE MAC               IP             NETMASK       MODULE NAME          VLAN
private eth0  00:16:3e:00:00:11 172.16.254.192 255.255.255.0 xennet compute-0-0-1 ------
private vlan2 ----------------- -------------- ------------- ------ ------------- 2
private vlan3 ----------------- -------------- ------------- ------ ------------- 3


FIGURE: Virtual VLAN devices called vlan2 and vlan3 are associated with the physical device that is designated as subnet private. Notice that no netmask is associated with the vlan2 and vlan3 devices. These are raw, tagged-packet interfaces and are mostly used for bridges when hosting VMs.

4.5. Network Bridging for Virtual Machines

Rocks support of virtual machines requires the proper setup of networking bridges. Rocks supports multiple network adapters for virtual machines. In this section, we describe the various kinds of bridging scenarios for virtual machines and how to set them up. For these examples, the physical machine will be called vm-container-0-0.

4.5.1. VM Network Bridging to Physical Devices

When a VM is bridged to a physical device, it must be assigned a compatible IP address in the same subnet as the physical device.

FIGURE: The virtual machine is bridged to eth0. In this case, eth0 of the VM is in the same subnet (with a compatible IP address). The VM and the container will be able to ping each other. This was the only configuration supported in Rocks 5.0.

The following example shows this most basic of bridging scenarios. The guest (compute-0-0-1) and the container (vm-container-0-0) are in the same IP subnet and will be able to ping each other.

[root@tranquil images]# rocks list host interface vm-container-0-0 compute-0-0-1
HOST              SUBNET  IFACE MAC               IP             NETMASK       MODULE NAME             VLAN
compute-0-0-1:    private eth0  00:16:3e:00:00:11 172.16.254.192 255.255.255.0 xennet compute-0-0-1    ------
vm-container-0-0: private eth0  00:09:6b:89:39:68 172.16.254.238 255.255.255.0 e1000  vm-container-0-0 ------

4.5.2. Logical VLAN Devices

In this scenario, the guest (hosted-vm-0-0-0) and the host (vm-container-0-0) are not in the same logical network.

    FIGURE: Guest VM is bridged through a logical VLAN device.

[root@rocks ~]# rocks list host interface vm-container-0-0 hosted-vm-0-0-0
HOST              SUBNET  IFACE MAC               IP             NETMASK   MODULE NAME             VLAN
hosted-vm-0-0-0:  ------- eth0  00:16:3e:00:00:05 -------------- --------- ------ hosted-vm-0-0-0  2
vm-container-0-0: private eth0  00:0e:0c:5d:7e:5e 10.255.255.254 255.0.0.0 e1000  vm-container-0-0 ------
vm-container-0-0: private vlan2 ----------------- -------------- --------- ------ ---------------- 2

In the above configuration, the logical VLAN device vlan2 (with tag=2) will be on the physical network eth0 on vm-container-0-0. The hosted-vm-0-0-0 node (a Rocks "appliance" that simply holds a generic VM guest) will have its interface on VLAN=2. The physical machine must have a logical VLAN device with the same tag.
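If the matching logical VLAN device did not already exist on the container, it could be created with the same commands shown in Section 4.4.2, substituting the container's hostname (a sketch):

[root@rocks ~]# rocks add host interface vm-container-0-0 vlan2
[root@rocks ~]# rocks set host interface vlan vm-container-0-0 vlan2 2
[root@rocks ~]# rocks set host interface subnet vm-container-0-0 vlan2 subnet=private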

    Below we give a more complicated configuration and walk through exactly what is bridged where.

[root@rocks ~]# rocks list host interface vm-container-0-0
SUBNET  IFACE   MAC               IP             NETMASK       MODULE NAME             VLAN
private eth0    00:0e:0c:5d:7e:5e 10.255.255.254 255.0.0.0     e1000  vm-container-0-0 ------
net10   eth1    00:10:18:31:74:84 192.168.1.10   255.255.255.0 tg3    vm-net10-0-0     ------
net10   vlan100 ----------------- -------------- ------------- ------ ---------------- 100
private vlan2   ----------------- -------------- ------------- ------ ---------------- 2

[root@rocks ~]# rocks list host interface hosted-vm-0-0-0
SUBNET IFACE MAC               IP NETMASK MODULE NAME            VLAN
------ eth0  00:16:3e:00:00:05 -- ------- ------ hosted-vm-0-0-0 2
------ eth1  00:16:3e:00:00:80 -- ------- ------ --------------- 100

In the above scenario, if hosted-vm-0-0-0 (Xen guest, DomU) were to be booted on physical host vm-container-0-0 (Dom0), the packets from the guest on eth0 will be tagged with VLAN=2, and those on eth1 with VLAN=100. The host machine must have logical VLAN interfaces, named "vlan*", with matching tags. To make the proper bridge configuration, Rocks will match the VLANs of the guest interfaces to the VLANs on the host. On the host, logical interface vlan2 is labeled as being on the private network (eth0) and logical vlan100 is labeled as being on the net10 network (eth1).


    4.5.3. Networking for Virtual Clusters

    FIGURE: Multiple VMs communicating on a Logical VLAN interface.

    FIGURE: Fully Virtualized cluster, including a virtual frontend.

4.6. Networking Configuration Examples

In this section, we describe some common networking configurations and how to use Rocks commands to set up various networking scenarios.

4.6.1. Adding a public IP address to the second ethernet adapter on a compute node

Often, owners want the second ethernet adapter to be on the public network and for the default routing to be in the public network. Assuming that the public network is 1.2.3.0/255.255.255.0 and the default gateway for that network is 1.2.3.1, the following set of commands defines the second interface of a compute node to have address 1.2.3.25 with name mypublic.myuniversity.edu, updates all required configuration files on the frontend, updates all required configuration files on the node compute-0-0, and restarts the network on compute-0-0.

# rocks set host interface ip compute-0-0 iface=eth1 ip=1.2.3.25
# rocks set host interface name compute-0-0 iface=eth1 name=mypublic.myuniversity.edu
# rocks set host interface subnet compute-0-0 eth1 public
# rocks add host route compute-0-0 1.2.3.0 eth1 netmask=255.255.255.0
# rocks sync config
# rocks sync host network compute-0-0

4.6.2. Adding an IP network for local message passing

Often, users will want to use the second ethernet device for message passing. In this example, we illustrate creating a named subnet and then scripting the IP assignment for a rack of 32 nodes with an IP range of 192.168.1.10 ... 192.168.1.41.

rocks add network fast subnet=192.168.1.0 netmask=255.255.255.0
IP=10
NNODES=32
NODE=0
while [ $NODE -lt $NNODES ]; do \
    rocks set host interface ip compute-0-$NODE iface=eth1 ip=192.168.1.$IP; \
    rocks set host interface subnet compute-0-$NODE iface=eth1 subnet=fast; \
    rocks set host interface name compute-0-$NODE iface=eth1 name=compute-fast-0-$NODE; \
    let IP++; \
    let NODE++; \
done
rocks sync config
rocks sync host network compute

The above will add the named subnet called "fast", assign IP addresses sequentially, name the eth1 interface on each node, rewrite the DNS configuration (sync config), and finally rewrite and restart the network configuration on each compute appliance. This additional network configuration is persistent across re-installation of nodes.


Chapter 5. Customizing your Rocks Installation

5.1. Adding Packages to Compute Nodes

Put the package you want to add in:

    /export/rocks/install/contrib/5.4.3/arch/RPMS

    Where arch is your architecture ("i386" or "x86_64").
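For example, on an x86_64 frontend the full path would be /export/rocks/install/contrib/5.4.3/x86_64/RPMS.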

    Create a new XML configuration file that will extend the current compute.xml configuration file:

# cd /export/rocks/install/site-profiles/5.4.3/nodes
# cp skeleton.xml extend-compute.xml

Inside extend-compute.xml, add the package name by changing the <package> section from:

<package> <!-- insert your package name here --> </package>

to:

<package> your package </package>

    It is important that you enter the base name of the package in extend-compute.xml and not the full name.

For example, if the package you are adding is named XFree86-100dpi-fonts-4.2.0-6.47.i386.rpm, input XFree86-100dpi-fonts as the package name in extend-compute.xml:

<package>XFree86-100dpi-fonts</package>

If you have multiple packages you'd like to add, you'll need a separate <package> tag for each. For example, to add both the 100 and 75 dpi fonts, the following lines should be in extend-compute.xml:

<package>XFree86-100dpi-fonts</package>
<package>XFree86-75dpi-fonts</package>

Also, make sure that you remove any <package> lines which do not have a package name in them. For example, the file should NOT contain any lines such as:

<package> <!-- insert your package name here --> </package>


Now build a new Rocks distribution. This will bind the new package into a Red Hat compatible distribution in the directory /export/rocks/install/rocks-dist/....

# cd /export/rocks/install
# rocks create distro

    Now, reinstall your compute nodes.

5.1.1. Adding Specific Architecture Packages to Compute Nodes

Often on x86_64-based clusters, one wants to add the x86_64 and i386 versions of a package to compute nodes. To do this, in your extend-compute.xml file, supply the following section:

<package>pkg.x86_64</package>
<package>pkg.i386</package>

    Where pkg is the basename of the package.
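For example, to install both the 64-bit and 32-bit builds of a library such as glibc (an illustrative choice; any package shipped in both architectures works the same way), the section would read:

<package>glibc.x86_64</package>
<package>glibc.i386</package>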

    Now build a new Rocks distribution.

# cd /export/rocks/install
# rocks create distro

    Now, reinstall your compute nodes.

5.2. Customizing Configuration of Compute Nodes

Create a new XML configuration file that will extend the current compute.xml configuration file:

# cd /export/rocks/install/site-profiles/5.4.3/nodes/
# cp skeleton.xml extend-compute.xml

Inside extend-compute.xml, add your configuration scripts that will be run in the post configuration step of the Red Hat installer.

Put your bash scripts in between the <post> and </post> tags:

<post>
<!-- insert your scripts here -->
</post>
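For example, a minimal sketch of a post-install script (the message and target file are illustrative assumptions, not part of the stock distribution):

<post>
echo "configured by extend-compute.xml" >> /etc/motd
</post>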

    To apply your customized configuration scripts to compute nodes, rebuild the distribution:

# cd /export/rocks/install
# rocks create distro

    Then, reinstall your compute nodes.


5.3. Adding Applications to Compute Nodes

If you have code you'd like to share among the compute nodes, but your code isn't in an RPM (or in a roll), then this procedure describes how you can share it with NFS.

    On the frontend, go to the directory /share/apps.

    # cd /share/apps

Then add the files you'd like to share within this directory.

    All files will also be available on the compute nodes under: /share/apps. For example:

# cd /share/apps
# touch myapp
# ssh compute-0-0
# cd /share/apps
# ls
myapp

5.4. Configuring Additional Ethernet Interfaces

For compute nodes, Rocks uses the first ethernet interface (eth0) for management (e.g., reinstallation), monitoring (e.g., Ganglia) and message passing (e.g., OpenMPI over ethernet). Often, compute nodes have more than one ethernet interface. This procedure describes how to configure them.

Additional ethernet interfaces are configured from the frontend via the Rocks command line. It modifies entries in the networks table on the frontend to add information about an extra interface on a node.

    Once you have the information in the networks table, every time you reinstall, the additional NIC will be configured.

    Suppose you have a compute node with one configured network (eth0) and one unconfigured network (eth1):

# rocks list host interface compute-0-0
SUBNET  IFACE MAC               IP           NETMASK     MODULE NAME        VLAN
private eth0  00:1e:4f:b0:74:ef 10.1.255.254 255.255.0.0 tg3    compute-0-0 ------
------- eth1  00:10:18:31:74:43 ------------ ----------- tg3    ----------- ------

We'll configure eth1 with the following network info and then associate it with the new subnet "fast":

    Name = fast-0-0

    IP address = 192.168.1.1

# rocks set host interface ip compute-0-0 eth1 192.168.1.1
# rocks set host interface name compute-0-0 eth1 fast-0-0

Now we'll create a new network and associate it with the new interface:

    # rocks add network fast 192.168.1.0 255.255.255.0

And then we'll check our work:


# rocks list network
NETWORK  SUBNET        NETMASK       MTU
private: 10.1.0.0      255.255.0.0   1500
public:  137.110.119.0 255.255.255.0 1500
fast:    192.168.1.0   255.255.255.0 1500

Now associate the new network with eth1.

    # rocks set host interface subnet compute-0-0 eth1 fast

    The interface eth1 is now configured:

# rocks list host interface compute-0-0
SUBNET  IFACE MAC               IP           NETMASK       MODULE NAME        VLAN
private eth0  00:1e:4f:b0:74:ef 10.1.255.254 255.255.0.0   tg3    compute-0-0 ------
fast    eth1  00:10:18:31:74:43 192.168.1.1  255.255.255.0 tg3    fast-0-0    ------

After specifying new network settings for compute-0-0, execute the following commands to apply the settings:

# rocks sync config
# rocks sync host network compute-0-0

If you are configuring the interface for another public network, you can set the gateway for the interface with the rocks add host route command.

For example, to set the route for the 192.168.1.0 network to 192.168.1.254 for compute-0-0, you'd execute:

    # rocks add host route compute-0-0 192.168.1.0 192.168.1.254 netmask=255.255.255.0
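You can then confirm the stored route with the corresponding list command (a sketch; the output columns may vary by Rocks version):

# rocks list host route compute-0-0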

    5.5. Compute Node Disk Partitioning

5.5.1. Default Disk Partitioning

The default root partition is 16 GB, the default swap partition is 1 GB, and the default /var partition is 4 GB. The remainder of the root disk is set up as the partition /state/partition1.

Only the root disk (the first discovered disk) is partitioned by default. To partition all disks connected to a compute node, see the section Forcing the Default Partitioning Scheme for All Disks on a Compute Node.

    Table 5-1. Compute Node -- Default Root Disk Partition

Partition Name     Size
/                  16 GB
swap               1 GB
/var               4 GB
/state/partition1  remainder of root disk

After the initial installation, all data in the file systems labeled /state/partitionX will be preserved over reinstallations.

5.5.2. Customizing Compute Node Disk Partitions

In Rocks, to supply custom partitioning to a node, one must write code in a <pre> section, and the code must create a file named /tmp/user_partition_info. Red Hat kickstart partitioning directives should be placed inside /tmp/user_partition_info. This allows users to fully program their cluster nodes' partitions. In the examples below, we'll explore what this means.

    5.5.2.1. Single Disk Example

    Create a new XML node file that will replace the current partition.xml XML node file:

# cd /export/rocks/install/site-profiles/5.4.3/nodes/
# cp skeleton.xml replace-partition.xml

Inside replace-partition.xml, add the following section right after the <main> </main> section:

    echo "clearpart --all --initlabel --drives=hdapart / --size 8000 --ondisk hdapart swap --size 1000 --ondisk hdapart /mydata --size 1 --grow --ondisk hda" > /tmp/user_partition_info

The above example uses a bash script to populate /tmp/user_partition_info. This will set up an 8 GB root partition, a 1 GB swap partition, and the remainder of the drive will be set up as /mydata. Additional drives on your compute nodes can be set up in a similar manner by changing the --ondisk parameter.

In the above example, the syntax of the data in /tmp/user_partition_info follows directly from Red Hat's kickstart. For more information on the part keyword, see the Red Hat Enterprise Linux 5 Installation Guide: Kickstart Options.
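As a further sketch, a two-disk variant of the same idea (the /scratch mountpoint and the hdb device are illustrative assumptions for a node with a second IDE drive):

<pre>
echo "clearpart --all --initlabel --drives=hda,hdb
part / --size 8000 --ondisk hda
part swap --size 1000 --ondisk hda
part /mydata --size 1 --grow --ondisk hda
part /scratch --size 1 --grow --ondisk hdb" > /tmp/user_partition_info
</pre>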


    User-specified partition mountpoint names (e.g., /mydata) cannot be longer than 15 characters.

    Then apply this configuration to the distribution by executing:

# cd /export/rocks/install
# rocks create distro

To reformat compute node compute-0-0 to your specification above, you'll need to first remove the partition info for compute-0-0 from the database:

    # rocks remove host partition compute-0-0

Then you'll need to remove the file .rocks-release from the first partition of each disk on the compute node. Here's an example script:

for file in $(mount | awk '{print $3}')
do
  if [ -f $file/.rocks-release ]
  then
    rm -f $file/.rocks-release
  fi
done

    Save the above script as /share/apps/nukeit.sh and then execute:

    # ssh compute-0-0 sh /share/apps/nukeit.sh

    Then, reinstall the node:

    # ssh compute-0-0 /boot/kickstart/cluster-kickstart

    5.5.2.2. Software Raid Example

If you would like to use software RAID on your compute nodes, inside replace-partition.xml add a <pre> section that looks like:

    echo "clearpart --all --initlabel --drives=hda,hdbpart / --size 8000 --ondisk hdapart swap --size 1000 --ondisk hda

    part raid.00 --size=10000 --ondisk hdapart raid.01 --size=10000 --ondisk hdb

    raid /mydata --level=1 --device=md0 raid.00 raid.01" > /tmp/user_partition_info

    Then apply this configuration to the distribution by executing:

# cd /export/rocks/install
# rocks create distro

To reformat compute node compute-0-0 to your specification above, you'll need to first remove the partition info for compute-0-0 from the database:

    # rocks remove host partition compute-0-0

Then you'll need to remove the file .rocks-release from the first partition of each disk on the compute node. Here's an example script:

for file in $(mount | awk '{print $3}')
do
  if [ -f $file/.rocks-release ]
  then
    rm -f $file/.rocks-release
  fi
done

    Save the above script as /share/apps/nukeit.sh and then execute:

    # ssh compute-0-0 sh /share/apps/nukeit.sh

    Then, reinstall the node:

    # ssh compute-0-0 /boot/kickstart/cluster-kickstart

    5.5.2.3. Programmable Partitioning

Some issues with the above two examples are that 1) you must know the name of the disk device (e.g., hda) and 2) the partitioning will be applied to all nodes. We can avoid these issues by writing a python program that emits node-specific partitioning directives.

In the next example, we'll use some Rocks partitioning library code to dynamically determine the name of the boot disk.

import rocks_partition

membership = '&membership;'
nodename = '&hostname;'

def doDisk(file, disk):
    file.write('clearpart --all --initlabel --drives=%s\n' % disk)
    file.write('part / --size=6000 --fstype=ext3 --ondisk=%s\n' % disk)
    file.write('part /var --size=2000 --fstype=ext3 --ondisk=%s\n' % disk)
    file.write('part swap --size=2000 --ondisk=%s\n' % disk)
    file.write('part /mydata --size=1 --grow --fstype=ext3 --ondisk=%s\n' % disk)

#
# main
#
p = rocks_partition.RocksPartition()
disks = p.getDisks()

if len(disks) == 1:
    file = open('/tmp/user_partition_info', 'w')
    doDisk(file, disks[0])
    file.close()

The function getDisks() returns a list of discovered disks. In the code sample above, if only one disk is discovered on the node, then the function doDisk is called, which outputs partitioning directives for a single disk. This code segment will work for nodes with IDE or SCSI controllers. For example, a node with an IDE controller will name its disks hdX and a node with SCSI controllers will name its disks sdX. But the code segment above doesn't care how the node names its drives; it only cares if one drive is discovered.

The next example shows how to automatically configure a node for software RAID when it discovers 2 disks. But, if the node only discovers 1 disk, it will output partitioning info appropriate for a single-disk system.

import rocks_partition

membership = '&membership;'
nodename = '&hostname;'

def doRaid(file, disks):
    file.write('clearpart --all --initlabel --drives=%s\n' % ','.join(disks))

    raidparts = []

    for disk in disks:
        if disk == disks[0]:
            part = 'part / --size=6000 --fstype=ext3 ' + \
                '--ondisk=%s\n' % disk
            file.write(part)

            part = 'part /var --size=2000 --fstype=ext3 ' + \
                '--ondisk=%s\n' % disk
            file.write(part)

        part = 'part raid.%s --size=5000 --ondisk=%s\n' % (disk, disk)
        file.write(part)

        raidparts.append('raid.%s' % disk)

    raid = 'raid /bigdisk --fstype=ext3 --device=md0 --level=1 %s\n' \
        % ' '.join(raidparts)
    file.write(raid)

def doDisk(file, disk):
    file.write('clearpart --all --initlabel --drives=%s\n' % disk)
    file.write('part / --size=6000 --fstype=ext3 --ondisk=%s\n' % disk)
    file.write('part /var --size=2000 --fstype=ext3 --ondisk=%s\n' % disk)
    file.write('part swap --size=2000 --ondisk=%s\n' % disk)
    file.write('part /mydata --size=1 --grow --fstype=ext3 --ondisk=%s\n' % disk)

#
# main
#
p = rocks_partition.RocksPartition()
disks = p.getDisks()

file = open('/tmp/user_partition_info', 'w')

if len(disks) == 2:
    doRaid(file, disks)
elif len(disks) == 1:
    doDisk(file, disks[0])

file.close()

If the node has 2 disks (if len(disks) == 2:), then doRaid() is called to configure a software RAID 1 over the 2 disks. If the node has 1 disk, then doDisk() is called to output partitioning directives for a single disk.

In the next example, we show how to output user-specified partitioning info for only one specific node (compute-0-0). All other nodes that execute this <pre> section will get the default Rocks partitioning.

import rocks_partition

membership = '&membership;'
nodename = '&hostname;'

def doRaid(file, disks):
    file.write('clearpart --all --initlabel --drives=%s\n' % ','.join(disks))

    raidparts = []

    for disk in disks:
        if disk == disks[0]:
            part = 'part / --size=6000 --fstype=ext3 ' + \
                '--ondisk=%s\n' % disk
            file.write(part)

            part = 'part /var --size=2000 --fstype=ext3 ' + \
                '--ondisk=%s\n' % disk
            file.write(part)

        part = 'part raid.%s --size=5000 --ondisk=%s\n' % (disk, disk)
        file.write(part)

        raidparts.append('raid.%s' % disk)

    raid = 'raid /bigdisk --fstype=ext3 --device=md0 --level=1 %s\n' \
        % ' '.join(raidparts)
    file.write(raid)

def doDisk(file, disk):
    file.write('clearpart --all --initlabel --drives=%s\n' % disk)
    file.write('part / --size=6000 --fstype=ext3 --ondisk=%s\n' % disk)
    file.write('part /var --size=2000 --fstype=ext3 --ondisk=%s\n' % disk)
    file.write('part swap --size=2000 --ondisk=%s\n' % disk)
    file.write('part /mydata --size=1 --grow --fstype=ext3 --ondisk=%s\n' % disk)

#
# main
#
p = rocks_partition.RocksPartition()
disks = p.getDisks()

if nodename in [ 'compute-0-0' ]:
    file = open('/tmp/user_partition_info', 'w')

    if len(disks) == 2:
        doRaid(file, disks)
    elif len(disks) == 1:
        doDisk(file, disks[0])

    file.close()

5.5.3. Forcing the Default Partitioning Scheme for All Disks on a Compute Node

This procedure describes how to force all the disks connected to a compute node back to the default Rocks partitioning scheme, regardless of the current state of the disk drives on the compute node.

The root disk will be partitioned as described in Default Partitioning, and all remaining disk drives will have one partition with the name /state/partition2, /state/partition3, ...

    For example, the following table describes the default partitioning for a compute node with 3 SCSI drives.


Table 5-2. A Compute Node with 3 SCSI Drives

Device Name  Mountpoint         Size
/dev/sda1    /                  16 GB
/dev/sda2    swap               1 GB
/dev/sda3    /var               4 GB
/dev/sda4    /state/partition1  remainder of root disk
/dev/sdb1    /state/partition2  size of disk
/dev/sdc1    /state/partition3  size of disk

    Create a new XML configuration file that will replace the current partition.xml configuration file:

# cd /export/rocks/install/site-profiles/5.4.3/nodes/
# cp skeleton.xml replace-partition.xml

Inside replace-partition.xml, add the following <pre> section:

    echo "rocks force-default" > /tmp/user_partition_info

    Then apply this configuration to the distribution by executing:

# cd /export/rocks/install
# rocks create distro

To reformat compute node compute-0-0 to your specification above, you'll need to first remove the partition info for compute-0-0 from the database:

    # rocks remove host partition compute-0-0

Then you'll need to remove the file .rocks-release from the first partition of each disk on the compute node. Here's an example script:

for file in $(mount | awk '{print $3}')
do
  if [ -f $file/.rocks-release ]
  then
    rm -f $file/.rocks-release
  fi
done

    Save the above script as /share/apps/nukeit.sh and then execute:

    # ssh compute-0-0 sh /share/apps/nukeit.sh

    Then, reinstall the node:


    # ssh compute-0-0 /boot/kickstart/cluster-kickstart

After you have returned all the compute nodes to the default partitioning scheme, you'll want to remove replace-partition.xml in order to allow Rocks to preserve all non-root partition data.

    # rm /export/rocks/install/site-profiles/5.4.3/nodes/replace-partition.xml

    Then apply this update to the distribution by executing:

# cd /export/rocks/install
# rocks create distro

5.5.4. Forcing Manual Partitioning Scheme on a Compute Node

This procedure describes how to force a compute node to always display the manual partitioning screen during install. This is useful when you want full and explicit control over a node's partitioning.

    Create a new XML configuration file that will replace the current partition.xml configuration file:

# cd /export/rocks/install/site-profiles/5.4.3/nodes/
# cp skeleton.xml replace-partition.xml

Inside replace-partition.xml, add the following <pre> section:

    echo "rocks manual" > /tmp/user_partition_info

    Then apply this configuration to the distribution by executing:

# cd /export/rocks/install
# rocks create distro

The next time you install a compute node, you will see the manual partitioning screen.

FIGURE: The manual partitioning screen.


    To interact with the above screen, from the frontend execute the command:

    # rocks-console compute-0-0

    5.6. Creating a Custom Kernel RPM

5.6.1. Creating a Custom Kernel RPM using kernel.org's Source

    On the frontend, check out the Rocks source code. See Access to Rocks Source Code for details.

    Change into the directory:

    # cd rocks/src/roll/kernel/src/kernel.org

    Download the kernel source tarball from kernel.org. For example:

    # wget http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.24.4.tar.gz

    Create a kernel "config" file and put it in config-

    You can create the config file by using the following procedure:

# tar xzf linux-2.6.24.4.tar.gz
# cd linux-2.6.24.4
# make menuconfig


Configure the kernel any way you need, and after the configuration is over, choose to save the configuration in an alternative location. Enter the name of the file as ../config-2.6.24.4. Finally, exit the configuration and remove the linux-2.6.24.4 directory.

The <version> number must match the version number of the kernel source. For example, if you downloaded linux-2.6.24.4.tar.gz, the name of the config file must be config-2.6.24.4.

    Update version.mk.

    The file version.mk has the following contents:

NAME = kernel
RELEASE = 1
VERSION = 2.6.24.4
PAE = 0
XEN = 0

    The VERSION value must match that of the linux kernel tarball you downloaded (e.g., 2.6.24.4).

If you are building a kernel for an i386 system that has more than 4 GB of memory, you'll need to set the PAE (page address extension) flag. This will name the resulting kernel kernel-PAE*rpm. If the anaconda installer detects more than 4 GB of memory, then it will install the kernel-PAE RPM and not the kernel RPM.
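For example, a version.mk for a PAE build of the 2.6.24.4 kernel would presumably look like:

NAME = kernel
RELEASE = 1
VERSION = 2.6.24.4
PAE = 1
XEN = 0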