Data Domain Student Guide


  • Backup Recovery Systems Division
    EMC Data Domain
    2421 Mission College Boulevard
    Santa Clara, CA 95054
    866-WE-DEDUPE
    408-980-4800
    www.datadomain.com

    EMC Data Domain System Administration Course

    April 2011

    MR-1CN-DDSAADMIN-50G1

  • EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. THE INFORMATION IN THIS PUBLICATION IS PROVIDED AS IS. EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Using, copying, and distributing EMC software described in this publication requires an applicable software license. EMC2, EMC, Data Domain, Global Compression, SISL, the EMC logo, and where information lives are registered trademarks or trademarks of EMC Corporation in the United States and other countries. All other trademarks used herein are the property of their respective owners. Copyright 2009-2011 EMC Corporation. All rights reserved. Published in the USA.

  • Contents

    Module 0: Course Introduction
      Module Objectives
      Lesson 1: Course Introduction
        Covered Skills
        Course Objectives
        Course Objectives (Continued)
        Course Content
        Course Content (Continued)
        Daily Agenda
        Guides, Introductions, and Orientation
      Lesson 2: VDC
        VDC Introduction
        Access VDC
        Lab 0.1: Access the VDC

    Module 1: Deduplication
      Module Introduction
      Module Objectives
      Deduplication Simplified Definition
      Inline Vs. Post-Process Deduplication
        Inline Deduplication
        Inline Deduplication Process
        Post Processing Deduplication
        Post-Process Deduplication Process
      File Based Deduplication
      Fixed Segment Deduplication
      Variable Segment Size Deduplication
      Module Review

    Module 2: Data Domain Operating Environment
      Module Introduction
      Lesson 1: Data Domain Deduplication
        How Deduplication Works
      Lesson 2: SISL
        SISL Definition
        Deduplication
        Deduplication (Continued)
      Lesson 3: DIA
        Definition
        DIA End-to-End Verification
        DIA Fault Avoidance and Containment
        DIA Continuous Fault Detection and Healing
        DIA File System Recovery
      Lesson 4: File Systems
        Administration File System
        Storage File System
        Storage File System (Continued)
        Storage File System (Continued)
      Lesson 5: Data Domain Data Paths
        Data Domain System in Typical Backup Environments
        Data Path over Ethernet
        Data Path over Fibre Channel VTL
      Lesson 6: Administration Interfaces
        Access Enterprise Manager
        Enterprise Manager Main Screen
        Enterprise Manager Tabs
        CLI
        CLI (Continued)
      Module Review
      Module Review (Continued)
      Module Review (Continued)

    Module 3: Initial Configuration and Backup
      Module Introduction
      Module Objectives
      Lesson 1: Verify Initial Configuration
        Launch Enterprise Manager
        Launch Configuration Wizard
        Lab 3.1: Initial Configuration
      Lesson 2: Manage System Access
        User Classes
        Manage Administration Access Protocols
        Create a User
      Lesson 3: Configure CIFS/NFS
        Configure a CIFS share
        Configure an NFS Export
      Lesson 4: Verify Hardware
        Model Number, System Uptime, Serial Number
        Storage (Disk) Status
        Disk Overview
        Active Tier
        Locate a Disk
        View Usable Disks
        View Failed, Foreign, or Absent Disks
        View Chassis Status
        Lab 3.2: Copy Data to a Data Domain System
      Module Review

    Module 4: Manage Network Interfaces
      Module Introduction and Objectives
      Lesson 1: Interfaces, Settings, & Routes
        Manage Network Routes
        Create Static Routes
      Lesson 2: Link Aggregation
        Definition
        Requirements
        Create a Virtual Interface for Link Aggregation
      Lesson 3: Link Failover
        Specifications
        Manage Link Failover
        Manage Link Failover (Continued)
      Lesson 4: Manage VLAN and IP Alias Network Interfaces
        Introduction and Objectives
        Define VLAN and IP Alias Network Interfaces
        Create VLAN and IP Aliases
        Create VLAN and IP Aliases (Continued)
        Create VLAN and IP Aliases (Continued)
      Module Review
      Module Review (Continued)
      Module Review (Continued)

    Module 5: Manage a VTL
      Module Introduction and Objectives
      VTL Overview
      Configuration Terms
      VTL Planning
      Capacity Planning
      Create Tapes
      VTL Barcode Definition
      Configure VTL Barcode
      Configure VTL Barcode (Continued)
      Configure VTL Barcode (Continued)
      Module Review

    Module 6: Manage Data
      Module Introduction and Objectives
      Lesson 1: Snapshots
        Manage Snapshots
        Lab 6.1 Configure Snapshot
      Lesson 2: Fastcopy
        Perform Fastcopy
        Lab 6.2: Perform Fastcopy
      Lesson 3: Retention Lock
        Retention Lock (Continued)
        Retention Lock (Continued)
        Configure Retention Lock
      Lesson 3: Sanitization
        Lab 6.3: Configure Sanitization
      Lesson 4: Encryption of Data at Rest
        Passphrase and Encryption Key
        File System Locking
        Encryption Flow
        Configure Encryption
        Apply Encryption Changes
        Deactivate Encryption
        Lab 6.4 Configure Encryption
      Lesson 5: File System Cleaning
        Cleaning
        Lab 6.5: Configure File-System Cleaning
      Lesson 6: Monitor File-System Space Usage
        File System Summary Tab
        Space Usage
          Space Usage Terms
        Space Consumption Tab
          Space Consumption Terms
        Daily Written Tab
          Daily Written Tab Terms
      Module Review
      Module Review (Continued)

    Module 7: Manage Data Replication and Recovery
      Module Introduction and Objectives
      Lesson 1: Data Replication
        Lesson Objectives
        Data Replication Overview
        Data Domain Replication Types
        Collection Replication
        Directory Replication
        Pool Replication
        Replication Context
        Replication Context Streams
        Replication Topologies
        Configure General Replication
        Configured Advanced Replication
        Replication Seeding
        Low Bandwidth Optimization Benefits
        Low Bandwidth Optimization Using Delta Compression
        Encryption Over Wire
        Lab 7.1: Replication
      Lesson 2: Recover Data
        Recover Replication-Pair Data
        Why Resynchronize Recovered Data?
        Resynchronization Process
        Manage Throttle Settings
      Module Review
      Module Review (Continued)

    Module 8: Manage DD Boost
      Module Introduction and Objectives
      DD Boost Overview
      DD Boost Overview (Continued)
      DD Boost Flow
      Replica Awareness Flow
      DD Boost Replication Awareness Advantages
      Deduplication With Distributed Segment Processing
      NetWorker Data Zone Architecture
      NetWorker Server Architecture
      Networker Work Flow
      Interface Groups
      Firewall Ports
      Download DD Boost Plug-In Software
      Configure DD Boost
      Configure DD Boost (Continued)
      Lab 8.1: DD Boost
      Module Review

    Module 9: Plan Capacity and Throughput, Monitor Throughput
      Module Introduction and Objective
      Lesson 1: Plan Capacity
        Collect Information
        Collect Information (Continued)
        Determine Capacity Needs
        Compression Requirements with Variables
        Compression Requirements with Variables (Continued)
        Calculate Required Capacity
        Capacity Requirements Calculation (Page 1 of 2)
        Capacity Requirements Calculation (Page 2 of 2)
        Calculate Required Throughput
        System Model Capacity and Performance
        Select Model
        Calculate Capacity Buffer for Selected Models
        Match Required Capacity to Model Specifications
        Calculate Performance Buffer for Selected Models
        Match Required Capacity to Model Specifications
      Lesson 2: Throughput Tuning
        Throughput Bottlenecks
        Performance Metrics
        Tuning Solutions
        Monitor Throughput
      Module Review

    Module 10: Monitor a Data Domain System
      Module Introduction
      Module Objectives
      Lesson 1: Monitor a Data Domain System Using SNMP
        SNMP Flow
        Download and Configure MIB File
        Lab 10.1: Monitor Using SNMP
      Lesson 2: Syslog (Remote Logging)
        Configure Remote Logging
        Lab 10.2: Monitor Using Syslog
      Lesson 3: Log Files
        ddvar log files
        Log File Types
        Lab 10.3: Manage Log Files
      Lesson 4: Support Bundles
        Generate Support Bundles
      Lesson 5: Autosupport
        Autosupport System
        Autosupport Types
        Autosupport Via Enterprise Manager
        Autosupport Reports
        Autosupport Reports (Continued)
        Autosupport Reports (Continued)
        Detailed Autosupport Report Contents
        Daily Summary Autosupport
        Alerts
        Alerts (Continued)
        Alerts (Continued)
        Alert Message Types
        Alerts Notification
        Find Autosupport Information
        Find System Autosupports
        Autosupport Device Symbols and Display Options
        View Space Plot, Autosupports, and Support Cases
        Lab 10.4: Autosupport
      Module Review
      Module Review (Continued)

    Module 11: Upgrade a Data Domain System
      Module Introduction and Objective
      Download Software
      System Upgrade

    Appendix A: Licenses

    Appendix B: Further Reading
      Access System Documentation
      Access Integration Documentation
      Access Part Replacement Documentation

  • Module 0: Course Introduction

    The Data Domain System Administration course provides you with the knowledge and skills you need to maintain Data Domain systems. This course includes:

    Lecture

    Demonstrations

    Hands-on lab exercises (see lab guide)

    Reviews

    Pointers to Data Domain system documentation

  • Module Objectives

    After completing this module, you should be able to:

    Describe this course

    Access the EMC Virtual Data Center (VDC)

  • Lesson 1: Course Introduction

    After completing lesson 1, you should be able to:

    Describe the objectives for this course

    Describe the course content

    Describe the daily agenda

    Identify the other students

    Identify your course materials

  • Covered Skills

    This course focuses on Data Domain system:

    Concepts

    Basic configuration

    System monitoring

    Other courses in the Data Domain curriculum cover:

    Design

    Installation

    3rd-party application integration

    Troubleshooting

    Parts replacement

  • Course Objectives

  • Course Objectives (Continued)

  • Course Content

    This course contains 11 modules. Each module contains lecture and review. Some modules contain labs.

    Module 0 (this module) is the course introduction. It:

    Gives a course overview.

    Introduces course materials

    Provides instructions to access the VDC.

    Module 1 covers deduplication (not Data Domain specific)

    Module 2 gives a product overview. It includes an overview of:

    The Data Domain operating system

    The Data Domain file system

    Data Domain data paths

    Module 3 covers how to perform the initial setup of a Data Domain system and how to do a file-system based backup.

  • Module 4 covers how to manage a Data Domain system network.

    Module 5 covers how to configure a VTL.

    Module 6 covers how to manage data.

  • Course Content (Continued)

    Module 7 covers how to perform data replication and recovery.

    Module 8 covers DD Boost

    Module 9 covers planning for capacity

    Module 10 covers monitoring a Data Domain system

    Module 11 covers upgrading the Data Domain operating system

    Appendix A lists Data Domain software licenses

    Appendix B tells you how to find Data Domain documentation

  • Daily Agenda

    This is a typical agenda. Your instructor may speak with you about variances.

  • Guides, Introductions, and Orientation

    Use your student guide to follow the lecture and take notes.

    Use your lab guide for step-by-step instructions on the labs.

  • Lesson 2: VDC

    EMC2 Education Services provides a virtual data center (VDC) for you to use during labs. The VDC gives you access to Data Domain systems.

  • VDC Introduction

    The VDC provides Microsoft Windows and Linux virtual machines (VMs).

  • Access VDC

    Your instructor will give you a user name and password to access the VDC.

  • Lab 0.1: Access the VDC

    Once your instructor gives you your user name, password and lab introduction, locate your lab guide and follow the step-by-step instructions to complete this lab.

  • Module 1: Deduplication

    Module Introduction

    This module introduces you to general deduplication. Data Domain system deduplication is discussed in the next module. This module reviews concepts that you learned about in the EMC Data Domain Technology and Systems Introduction (eLearning) course.

  • Module Objectives

    Deduplication is an important technology that improves data storage by providing extremely efficient data backups and archiving. In this module you learn more about general deduplication.

  • Deduplication Simplified Definition

    Deduplication eliminates redundant data because it stores only one instance of data. For example (and this is only an example, not a precise definition), the sentence "Mary had a little lamb" gets stored as "Mary hd lite mb". No second instances of letters get stored.

    Deduplication recognizes and deletes common elements in data. It stores only one copy of the duplicated data. It looks at each segment of an incoming data stream to determine if it needs to be stored or if it can be replaced by a smaller reference to a segment that is already on the disk.
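    The "Mary had a little lamb" illustration can be reproduced with a few lines of Python. This is only a toy sketch of the simplified definition above: it treats each letter as a unit, is case-sensitive, and drops words that end up empty. The function name is hypothetical, and real deduplication works on data segments, not letters.

    ```python
    def dedupe_letters(sentence: str) -> str:
        """Keep only the first instance of each letter, word by word."""
        seen = set()          # letters that are already "stored"
        kept_words = []
        for word in sentence.split():
            kept = []
            for ch in word:
                if ch not in seen:      # first instance: store it
                    seen.add(ch)
                    kept.append(ch)
                # repeated letters are dropped (only a reference would remain)
            if kept:
                kept_words.append("".join(kept))
        return " ".join(kept_words)


    print(dedupe_letters("Mary had a little lamb"))  # Mary hd lite mb
    ```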

  • Inline Vs. Post-Process Deduplication

    Inline Deduplication

    It is simpler to use inline deduplication. Data is filtered before it's stored to disk, so it's like a regular storage system (it just writes and reads data). There's no separate administration involved (such as managing multiple pools, some with deduplication and some with regular storage, or managing the conditions between them).

    Incoming data is examined as soon as it arrives to determine if the segment is new (unique) or a duplicate of a segment previously stored.

    Inline deduplication occurs in RAM before the data is written to disk. Around ninety-nine percent of data segments are analyzed in RAM without disk access. A very small amount of data is not identified immediately as either unique or redundant. That data is stored to disk and examined again later against the already stored data.

    Because deduplication is done with limited disk access, the speed of inline deduplication is not limited by disk seek times. Stream speed is as fast as other virtual tape library products that do not have deduplication.

  • Inline Deduplication Process

    1. Inbound segments are analyzed in RAM.

    2. If a segment is redundant, a reference to the stored duplicate segment is created.

    3. If a segment is unique, it is compressed and stored.

    4. If a segment cannot be identified as unique or redundant, the segment is stored and re-examined later.

    Post Processing Deduplication

    Requires disk space to initially capture the data.

    Requires more disk space to store multiple pools of data.

    Requires more disks for speed when the deduplication approach is spindle bound.

    Post-Process Deduplication Process

    1. Data is first buffered to a large cache.

    2. Deduplication is then run as a separate processing task, which could lengthen the time needed to fully complete the backup.

    3. After the data is stored, it's read back internally, deduplicated, and written again to a different area.

  • File Based Deduplication

    With file-based deduplication, if 2 files are exactly alike, 1 file is stored and future iterations of the file are pointed to the original file.

    File-based deduplication doesn't segment data like a Data Domain system, nor does it chunk data like an Avamar system. In certain situations it can be efficient. However, it often doesn't result in as large a data reduction as other deduplication methods.

  • Fixed Segment Deduplication

    With fixed-segment deduplication, if you add a segment, the entire data stream moves.

  • Variable Segment Size Deduplication

    Sub-file (segment) deduplication:

    Analyses data backup streams

    Breaks files into smaller fixed- or variable-sized segments. Smaller segments make it easier for Data Domain systems to find duplicates.

    Is good for backup data stores

    Compares backup data against existing data segments

    Is commonly used as a quick fix for backup problems

    Is more efficient than file based deduplication

    Variable segment size deduplication is better than fixed-segment size deduplication because you can add data to a variable segment and it doesn't move the data stream.

    If you add data to a fixed segment, the entire data stream moves.
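    The difference can be seen in a small Python experiment. This is a sketch under stated assumptions, not the Data Domain algorithm: the function names are hypothetical, the sample stream is synthetic, and the variable-size rule (cut after any byte whose low four bits are all ones) is a toy stand-in for the content-based boundary tests that real products use.

    ```python
    def fixed_chunks(data: bytes, size: int = 64) -> list:
        """Cut the stream every `size` bytes, regardless of content."""
        return [data[i:i + size] for i in range(0, len(data), size)]


    def variable_chunks(data: bytes, mask: int = 0x0F) -> list:
        """Cut the stream after any byte whose low bits are all ones.

        Boundaries depend only on content, so they move with the data.
        """
        chunks, start = [], 0
        for i, byte in enumerate(data):
            if (byte & mask) == mask:
                chunks.append(data[start:i + 1])
                start = i + 1
        if start < len(data):
            chunks.append(data[start:])  # trailing bytes with no boundary
        return chunks


    def shared(old: list, new: list) -> int:
        """Count distinct chunks of the old stream that reappear in the new one."""
        return len(set(old) & set(new))


    original = bytes(range(256)) * 4   # synthetic 1 KiB sample stream
    modified = b"\x58" + original      # one byte added at the front

    print("fixed   :", shared(fixed_chunks(original), fixed_chunks(modified)))
    print("variable:", shared(variable_chunks(original), variable_chunks(modified)))
    ```

    With this sample, adding one byte shifts every fixed-size chunk, so none of the previously stored chunks is found again. The content-defined boundaries stay attached to the data, so every distinct chunk of the original stream is still recognized.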

  • Module Review

  • Module 2: Data Domain Operating Environment

    Module Introduction

    This module introduces you to the Data Domain system. It reviews concepts that you learned about in the EMC Data Domain Systems and Technology Introduction (eLearning) course.

    This module includes 6 lessons:

    1. Data Domain Deduplication

    2. SISL

    3. DIA

    4. File Systems

    5. Data Domain Data Paths

    6. Administration Interfaces

  • Lesson 1: Data Domain Deduplication

    Data Domain systems are disk-based deduplication appliances. In this lesson you will learn more about deduplication.

  • How Deduplication Works

    The end result of identifying unique data segments and compressing the unique data before storage is a significant reduction in the size of the data stored on disk. Because of the size reductions, data can be retained on disk on site. The reduced data size also makes WAN vaulting possible because of up to 99% bandwidth reduction.

    Data Domain global compression technology reduces the data footprint by applying a combination of deduplication and local compression.

    Deduplication works by breaking the data into segments and then identifying the unique segments. Local compression is performed on the unique segments using standard compression algorithms before the unique data is written to disk.

    For example, in the illustration of the first full backup, two segments labeled B are the same, so segments A, B, C, and D are stored with a reference to B instead of storing a second copy.

    The compression factor at this point is the ratio of the size of the original 5 segments received (A+B+C+B again+D) to the size of the 4 segments (A+B+C+D) stored on disk. Usual data reduction for a first full backup is 3-4x.

  • The next backup in the illustration is incremental. The backup includes copies of A and B as well as a new segment E. Only the new segment E is stored. A and B are stored as references to the previously stored segments. The compression factor of this backup is quite good since it is the ratio of the 3 received segments (A+B+E) to the single stored segment E. Usual data reduction for a file level incremental is 6-7x.

    The second full backup is when the reductions start to become very large. A, B, C, D, and E are recognized as duplicates from the previous two backups, and only the new segment F gets stored. The compression factor of this second full backup is very high, with 6 segments coming in but only the one new segment getting stored. Usual data reduction for the second full backup is 50-60x.

    The compression factor over all three backups is the ratio of all 14 segments coming from the backup software to be stored to the 6 segments that get stored to represent all the data received over time.
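    The segment counts in the illustration can be turned into compression factors with a short calculation. This sketch assumes all segments are the same size and counts segments only; it ignores local compression, so its results are not meant to match the "usual data reduction" figures quoted above. The function name is hypothetical.

    ```python
    # (backup name, segments received, new segments stored)
    backups = [
        ("first full", 5, 4),    # A, B, C, B, D received; A, B, C, D stored
        ("incremental", 3, 1),   # A, B, E received; only E stored
        ("second full", 6, 1),   # A through F received; only F stored
    ]


    def compression_factor(received: int, stored: int) -> float:
        """Ratio of segments received to segments actually stored."""
        return received / stored


    for name, received, stored in backups:
        print(f"{name}: {compression_factor(received, stored):.2f}x")

    total_received = sum(received for _, received, _ in backups)  # 14
    total_stored = sum(stored for _, _, stored in backups)        # 6
    print(f"cumulative: {compression_factor(total_received, total_stored):.2f}x")
    ```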

    Deduplication is the process of recognizing common elements in the many versions and copies of data and eliminating the redundant copies of those common elements.

    With deduplication only one copy of duplicated data is stored.

    Because deduplication is performed with limited disk access, the speed of inline deduplication is not limited by disk seek times. Stream speed of Data Domain systems is as fast as other virtual tape library products that do not have deduplication. The process is shown as follows:

    Inbound data is analyzed in RAM.

    If data is redundant, a reference to the stored duplicate data is created.

    If data is unique, it is compressed and stored.

    If data cannot be identified as unique or redundant, the data is stored and re-examined later.

    For IT administrators, deduplication means there are fewer and smaller storage systems to manage, smaller data centers to run, fewer tapes to handle, fewer tape pickups, smaller network pipes, and cheaper WAN links.

  • Lesson 2: SISL

  • SISL Definition

  • Deduplication

    The Data Domain system:

    1. Segments

    2. Fingerprints

    3. Filters

    4. Compresses

    5. Writes data to containers; the containers are written to disk

  • Deduplication (Continued)

    The Data Domain system:

    1. Segments

    2. Fingerprints

    3. Filters

    4. Compresses

    5. Writes data to a container; the container is written to disk
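    The five steps can be sketched as a small in-memory store. This is a minimal illustration, not the Data Domain implementation: the class name, the fixed segment size, the use of SHA-1 for fingerprints, zlib for local compression, and a Python list as the "container" are all assumptions made for the example.

    ```python
    import hashlib
    import zlib


    class DedupStore:
        """Toy segment store: segment, fingerprint, filter, compress, write."""

        def __init__(self, segment_size: int = 4):
            self.segment_size = segment_size
            self.index = {}       # fingerprint -> position in the container
            self.container = []   # compressed unique segments, append-only

        def write(self, data: bytes) -> list:
            """Store a stream and return its recipe (list of fingerprints)."""
            recipe = []
            for off in range(0, len(data), self.segment_size):
                segment = data[off:off + self.segment_size]       # 1. segment
                fingerprint = hashlib.sha1(segment).hexdigest()   # 2. fingerprint
                if fingerprint not in self.index:                 # 3. filter
                    compressed = zlib.compress(segment)           # 4. compress
                    self.container.append(compressed)             # 5. write
                    self.index[fingerprint] = len(self.container) - 1
                recipe.append(fingerprint)  # duplicates become references
            return recipe

        def read(self, recipe: list) -> bytes:
            """Rebuild a stream from its recipe."""
            return b"".join(
                zlib.decompress(self.container[self.index[fp]]) for fp in recipe
            )


    store = DedupStore(segment_size=4)
    first_full = b"AAAABBBBCCCCBBBBDDDD"      # segments A, B, C, B, D
    recipe = store.write(first_full)
    print(len(recipe), "segments received,", len(store.container), "stored")
    ```

    The sample stream mirrors the first full backup in the earlier illustration: five segments arrive, and the repeated B segment is kept only as a reference.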

  • Lesson 3: DIA

  • Definition

  • DIA End-to-End Verification

    The flow for continuous fault detection and healing is:

    1. The Data Domain system periodically rechecks the integrity of the RAID stripes and container logs.

    2. The Data Domain system uses the redundancy of the RAID system to heal any faults.

    3. During every read, data integrity is re-verified.

    4. Errors are healed as they are encountered.

  • DIA Fault Avoidance and Containment

    The following is true for fault avoidance and containment:

    New data never puts old data at risk

    Container log never overwrites or updates existing data

    New data written in new containers

    The old containers and references remain in place and safe even in the face of software bugs or hardware faults that may occur when storing new backups

    New data never overrides good data

    Fewer complex data structures

    NVRAM for fast restarts

    No partial stripe writes

  • DIA Continuous Fault Detection and Healing

    Here is the flow for continuous fault detection and healing:

    1. The Data Domain system periodically rechecks the integrity of the RAID stripes and container logs

    2. Uses the redundancy of RAID system to heal any faults

    3. During every read, data integrity is re-verified

    4. Any errors are healed as they are encountered

  • DIA File System Recovery

    Here is the flow for file system recoverability:

    1. Data is written in a self-describing format

    2. The file system can be recreated by scanning the logs and rebuilding it from metadata stored with the data
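    The idea of a self-describing log can be sketched in Python. This is a hypothetical record layout chosen for illustration (a SHA-1 digest and a length in front of each payload), not the Data Domain on-disk format. Because each record carries its own metadata, an index can be rebuilt by scanning the log from the start.

    ```python
    import hashlib
    import struct

    # Hypothetical record header: 20-byte SHA-1 digest + 4-byte payload length.
    HEADER = struct.Struct(">20sI")


    def append_record(log: bytearray, payload: bytes) -> int:
        """Append a self-describing record and return its offset in the log."""
        offset = len(log)
        log.extend(HEADER.pack(hashlib.sha1(payload).digest(), len(payload)))
        log.extend(payload)
        return offset


    def rebuild_index(log: bytes) -> dict:
        """Scan the log and rebuild a digest -> offset index from the headers."""
        index, pos = {}, 0
        while pos < len(log):
            digest, length = HEADER.unpack_from(log, pos)
            start = pos + HEADER.size
            payload = bytes(log[start:start + length])
            if hashlib.sha1(payload).digest() != digest:
                raise ValueError(f"corrupt record at offset {pos}")
            index[digest] = pos
            pos = start + length
        return index


    log = bytearray()
    append_record(log, b"alpha")
    append_record(log, b"beta")
    print(len(rebuild_index(log)), "records recovered by scanning the log")
    ```

    The digest check during the scan also shows, in a very reduced form, how integrity can be re-verified whenever data is read.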

  • Lesson 4: File Systems

  • Administration File System

    The Data Domain system administrative file system is called ddvar:

    The NFS directory is /ddvar

    The CIFS directory is \ddvar

    This file system stores system core and log files.

    You cannot rename or delete this file system. Nor can you access all of its sub-directories, for example, the core sub-directory.

    Data streams for this file system change according to the Data Domain OS version and hardware model. Check the Data Domain support portal for more information on data streams for each OS and hardware model.

  • Storage File System

    /data/col1/ is the parent directory path under which all user data is retained.

    MTrees provide granular data management so that you can manage and report on different data types or data from different sources separately.

    For example, you can configure compression separately on different types of data in separate MTrees.

    Note: You can't replicate under MTrees.

  • Storage File System (Continued)

  • Storage File System (Continued)

    You can add up to 14 MTrees under /data/col1 to keep data separate. You can add subdirectories to MTree directories.

    You cannot add anything to the /data directory. You can only change to the col1 subdirectory.

    Subdirectories can be added under /data/col1/backup.

    The backup MTree (/data/col1/backup) cannot be deleted or renamed.

    If MTrees are added, they can be renamed and deleted.

    You can replicate under /backup

    Note: This slide shows the Data Domain system view, not the client view. The client view would not show /data/col1.

  • Lesson 5: Data Domain Data Paths

  • Data Domain System in Typical Backup Environments

    Data Domain systems connect to backup servers as storage capacity to hold large collections of backup data.

    A Data Domain system integrates into typical backup environments non-intrusively.

    Often the Data Domain system is connected directly to the backup server.

    The backup data flow is redirected from the clients to the Data Domain device instead of to tape.

    If tape needs to be made for long term archival retention, data flows from the Data Domain device back to the server and then to tape, completing the same flow that the backup server was doing initially.

    Tapes come out in the same standard backup software formats as before and can go off site for long term retention.

    If a tape must be retrieved, it goes back into the tape library and the data flows back through the backup software to the client that needs it.

  • Data Path over Ethernet

    Backup and archive media servers send data from clients to Data Domain systems on the network. You can also use a direct connection between a dedicated port on the backup or archive server and a dedicated port on the Data Domain system.

    The connection between the backup or archive server and the Data Domain system can be Ethernet or Fibre Channel, or both if needed. This slide shows the Ethernet connection.

    Data is written to the backup file system on a Data Domain system.

    When Data Domain replicator is licensed on two Data Domain systems, replication is enabled between the two systems. The Data Domain systems can be either local, for local retention, or remote, for disaster recovery. Data in flight over the WAN can be secured using VPN.

    Physical separation of the replication traffic from backup traffic can be achieved by using two separate Ethernet interfaces on the Data Domain system. This allows backups and replication to run simultaneously without network conflicts.

    Since the Data Domain OS is based on Linux, it needs additional software to work with CIFS. Samba software enables CIFS to work with the Data Domain OS.

  • Data Path over Fibre Channel VTL

    If the Data Domain virtual tape library (VTL) option is licensed, and a VTL FC HBA is installed on the Data Domain system, the system can be connected to a Fibre Channel SAN. The backup or archive media server sees the Data Domain system as one or multiple VTLs with up to 256 virtual LTO-1, LTO-2, or LTO-3 tape drives and 20,000 virtual slots across up to 100,000 virtual cartridges.

  • Lesson 6: Administration Interfaces

  • Access Enterprise Manager

    With the Enterprise Manager, you can manage 1 or more Data Domain systems.

    You can access the Enterprise Manager with many browsers, for example:

    Internet Explorer

    Chrome

    Firefox

    To login:

    1. In a browser enter the address

    2. Enter your user name and password

    3. Enter your system IP address

  • Enterprise Manager Main Screen

  • Enterprise Manager Tabs

  • CLI

    There are 4 ways to log into the CLI:

    1. SSH (PuTTY)

    2. Serial console

    3. Telnet (default is not enabled)

    4. Keyboard and monitor (KVM)

    Initial login information:

    login: sysadmin

    password: serial number on Data Domain model (box).

  • CLI (Continued)

    You can do everything in the CLI that you can do from the Enterprise Manager.

    You enter commands with options, for example:

    #command argument options

    #filesys show space

    #user show list

    #user add Bob

    #help admin access

    #help show
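    Because the CLI is reachable over SSH, commands such as the ones above can also be sent from a script. The following Python sketch assumes an OpenSSH client is installed on the administrator's workstation and that non-interactive authentication (for example, an SSH key) has been set up; the host name and function names are hypothetical. It only wraps the standard ssh client and is not an EMC-provided interface.

    ```python
    import subprocess


    def build_ssh_command(host: str, cli_command: str, user: str = "sysadmin") -> list:
        """Build the argument list for running one CLI command over SSH."""
        return ["ssh", f"{user}@{host}", cli_command]


    def run_cli(host: str, cli_command: str, user: str = "sysadmin",
                timeout: int = 30) -> str:
        """Run a single CLI command on the remote system and return its output."""
        result = subprocess.run(
            build_ssh_command(host, cli_command, user),
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,   # raise CalledProcessError on a non-zero exit status
        )
        return result.stdout


    # Example (hypothetical host name; requires a reachable system):
    # print(run_cli("dd01.example.com", "filesys show space"))
    ```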

  • Module Review

  • Module Review (Continued)

  • Module Review (Continued)

  • Module 3: Initial Configuration and Backup

    Module Introduction

  • Module Objectives

    After completing this module, you should be able to:

    Verify and perform an initial configuration

    Manage system access

    Configure CIFS/NFS

    Verify hardware

    Copy data to a Data Domain system

  • Lesson 1: Verify Initial Configuration

    Your environment for this course assumes that your hardware installation is complete and your network

    connections are established. In this lesson you will learn how to verify or configure:

    Licenses

    Network settings

    System settings

    Protocols

    After you complete this lesson, you should be able to use the Enterprise Manager to perform an initial

    setup of the following:

    Licenses

    Network and system settings

    Protocols: CIFS, NFS, DD Boost (to be expanded in the DD Boost module). Currently, you cannot complete the configuration for the VTL protocol in the VDC.

  • Launch Enterprise Manager

    To perform an initial configuration from the Enterprise Manager, do the following:

    1. Open the Enterprise Manager from an Internet Explorer browser, for example:

    http://dddev-01/ddem

    2. Enter your assigned Username and Password.

    3. Double-click on Login.

  • Launch Configuration Wizard

    To launch the Configuration Wizard:

    1. From the Enterprise Manager, Click on Maintenance.

    2. Click on the More Tasks pull-down menu.

    3. Double-Click on Launch Configuration Wizard...

    4. Follow the Configuration Wizard prompts.

    You must follow the configuration prompts. You can't select an item to configure from the left-hand navigation

    pane. You will be prompted to submit your configuration changes as you move through the wizard.

    You can also quit the wizard during your configuration.

    The CLI command is config setup.

  • Lab 3.1: Initial Configuration

  • Lesson 2: Manage System Access

    In a Data Domain system there are 2 user privilege types: admin and user. In this lesson, you'll learn how

    to manage both.

  • User Classes

    You may have to manage access for administrators and users on a Data Domain system. A Data Domain system

    supports 3 classes of access:

    1. Sysadmin (default account):

    Default administrative account.

    Can't be deleted.

    Creates the first security officer.

    Has access to all GUI and CLI configuration, management, and monitoring commands.

    Can view all users.

    Can change the sysadmin password, but can't delete the account.

    Can enable/disable users with admin or user privileges, except the sysadmin user.

    Is the domain admin by default when you integrate with Active Directory.

    2. Local user (user):

    Has access to GUI and CLI monitoring commands.

  • Can view only their own account.

    Can't disable sysadmin.

    Can't make changes to the configuration.

    3. Security officer (user):

    Can enable, disable, modify, and delete other security officers.

    Can view all users.

  • Manage Administration Access Protocols

    Use the Access Management page to configure and manage access protocols:

    Telnet access

    FTP access

    HTTP/HTTPS access

    SSH access

  • Create a User

    To create new users, follow these steps:

    1. In the Navigational pane, expand the DD Network and select a system.

    2. Click the System Settings > Access Management > Local Users tabs. The Local Users view appears.

    3. Click the Create button to create a new user. The Create User dialog box appears.

    4. Enter the following information in the General Tab:

    Item: Description

    User: The user ID or name.

    Password: The user password. Set a default password, and the user can change it later.

    Verify Password: The user password, again.

    Privilege: The privilege of the user: admin, security, or user.

  • The default minimum password length and the default minimum number of character classes required for a user password are both 1. Allowable character classes include:

    Lowercase letters (a-z)

    Uppercase letters (A-Z)

    Numbers (0-9)

    Special Characters ($, %, #, +, and so on)
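
    The character-class rule above can be sketched in a few lines of Python. This is an illustration only, not the Data Domain implementation: the function names are hypothetical, and treating every character in Python's string.punctuation as a "special character" is an assumption.

```python
import string


def count_character_classes(password: str) -> int:
    """Count how many of the four character classes appear in the password."""
    classes = [
        any(c in string.ascii_lowercase for c in password),  # a-z
        any(c in string.ascii_uppercase for c in password),  # A-Z
        any(c in string.digits for c in password),           # 0-9
        any(c in string.punctuation for c in password),      # $, %, #, + ...
    ]
    return sum(classes)


def meets_policy(password: str, min_length: int = 1, min_classes: int = 1) -> bool:
    """Check a password against a minimum length and class count.

    The defaults of 1 mirror the default policy values described in the guide.
    """
    return (len(password) >= min_length
            and count_character_classes(password) >= min_classes)
```

    With the default values, any non-empty password passes. Raising min_classes to 3, for example, would reject a password made only of lowercase letters and digits.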

    The available privileges display based on the user's privilege. Only the Sysadmin user can create the first

    security officer. After the first security officer is created, only security officers can create or modify other

    security officers. Sysadmin is the default admin user and cannot be deleted or modified.

    5. Enter the following information in the Advanced Tab:

    6. Click OK.

    The default password policy can change if the admin user changes it from the Modify Password Policy

    task. The default values are the initial default password policy values.

    Item: Description

    Minimum Days Between Change: The minimum number of days between password changes that you allow a user. Default is 0.

    Maximum Days Between Change: The maximum number of days between password changes that you allow a user. Default is 99999.

    Warn Days Before Expire: The number of days to warn the users before their password expires. Default is 7.

    Disable Days After Expire: The number of days after a password expires to disable the user account. Default is Never.

    Disable account on the following date: Check this box and enter a date (mm/dd/yyyy) on which you want to disable this account. Also, you can click on the calendar to select a date.

  • Lesson 3: Configure CIFS/NFS

  • Configure a CIFS share

    To configure a CIFS share, you must:

    1. Configure the workgroup mode

    2. Configure the Active Directory mode

    3. Give a descriptive name for the share

    4. Enter the path to the target directory (for example, /data/col1/mtree1)

  • Configure an NFS Export

    To configure an NFS export, do the following:

    1. Enter a path name for the export

    2. In the Clients area, select an existing client or click the + icon to create a client

    The Clients dialog box appears.

    a. Enter a server name in the text box.

    Enter fully qualified domain names, host names, or IP addresses. A single asterisk (*) as a wild card indicates that all backup servers are to be used as clients.

    Clients given access to the /data/col1/backup directory have access to the entire directory. A client given access to a subdirectory of /data/col1/backup has access only to that subdirectory.

    A client can be a fully-qualified domain host name, class-C IP addresses, IP addresses with either netmasks or length, an NIS netgroup name with the prefix @, or an asterisk (*) wildcard with a domain name, such as *.yourcompany.com.

    A client added to a subdirectory under /data/col1/backup has access only to that subdirectory.

    Enter an asterisk (*) as the client list to give access to all clients on the network.

  • b. Select the check boxes of the NFS options for the client.

    - Read-only permission.

    - (Default) Requires that requests originate on a port that is less than IPPORT_RESERVED (1024).

    - Map requests from UID or GID 0 to the anonymous UID or GID.

    - Map all user requests to the anonymous UID or GID.

    - Use default anonymous UID or GID.
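
    The client formats listed for an NFS export can be told apart with a short classifier. The sketch below is hypothetical (the function name and the returned labels are not part of the product) and only shows how the formats differ; it does not validate host names or netgroups.

```python
import ipaddress


def classify_nfs_client(spec: str) -> str:
    """Return a rough label for an NFS client entry (illustrative only)."""
    if spec == "*":
        return "all clients"
    if spec.startswith("@"):
        return "NIS netgroup"
    if spec.startswith("*."):
        return "domain wildcard"
    try:
        # Accepts a plain address, address/length, or address/netmask.
        ipaddress.ip_network(spec, strict=False)
    except ValueError:
        return "host name"
    return "network" if "/" in spec else "IP address"
```

    For example, classify_nfs_client("*.yourcompany.com") is labeled a domain wildcard, while "192.168.10.0/24" and "192.168.10.0/255.255.255.0" are both labeled networks.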

  • Lesson 4: Verify Hardware

    As part of your Data Domain initial setup, you should verify that your hardware is correctly installed and

    running. This lesson teaches you how to do that.

    The CLI command is system show status.

  • Model Number, System Uptime, Serial Number

    To verify your model number, system uptime, and serial number in the Enterprise Manager:

    1. Select a system from the left-hand navigation pane.

    2. Click the Maintenance tab.

    3. Verify the information.

    The CLI commands are:

    1. system show status

    2. system show config

    3. system disk show

  • Storage (Disk) Status

    To verify your storage status in the Enterprise Manager:

    1. Click the Hardware tab

    2. View the Storage Status

    If the Status indicator Storage operational is green, all disks are good. If the Status indicator Storage operational is yellow, the system is working, but there is a problem. If the Status indicator Storage operational is red, the system isn't operational.

  • Disk Overview

    To get an overview of your storage (disk) status in the Enterprise Manager:

    1. Select a system from the left-hand navigation pane.

    2. Click the Hardware tab.

    3. Click Storage.

    From here you can view the Active Tier, Usable Disks, and Failed/Foreign/Absent Disks.

  • Active Tier

    Disks in the Active Tier are currently marked as usable by the Data Domain file system. Sections are organized

    by Disks in Use and Disks Not in Use. If the optional archive feature is installed, from the Storage Status

    Overview pane you can expand your view of the disk use in the Active Tier. You can view for both Disks

    In Use and Disks Not In Use:

    Disk Group (here, dg0)

    Status (here, 1 Normal)

    Disk Reconstructing

    Total Disks

    Disks

    You can also expand the Disks In Use view to view individual disks. Once you View Details, you can Beacon

    individual disks.

    The CLI command is storage show.

  • Locate a Disk

    To locate a disk (for example, when a failed disk needs to be replaced):

    1. Select the Data Domain system in the Navigational pane.

    2. Click the Hardware > Storage > Disks tabs.

    The Disks view appears.

    3. Select a disk from the Disks table and click Beacon.

    The Beaconing Disk dialog window appears, and the LED light on the disk begins flashing.

    4. Click Stop to stop the LED beaconing.

  • View Usable Disks

    Usable disks are those that aren't incorporated into the file system yet. To view details about usable disks

    from the Enterprise Manager:

    1. Select a system from the left-hand navigation pane.

    2. Click the Hardware tab.

    3. View the status, which includes the disk:

    Name

    Status

    Size

    Manufacturer/Model

    Firmware

    Serial Number

    You can also view the details of individual disks. The CLI command is disk show status.

  • View Failed, Foreign, or Absent Disks

    To get the status on Failed/Foreign/Absent Disks in the Enterprise Manager:

    1. Select a system from the left-hand navigation pane.

    2. Open the Failed/Foreign/Absent Disks panel.

    3. View the following disk information:

    Name

    Status

    Size

    Manufacturer/Model

    Firmware

    Serial Number

    4. You can also view the details of individual disks.

  • View Chassis Status

    To view your chassis status in the Enterprise Manager:

    1. From the left-hand navigation pane, select a system.

    2. Click the Hardware tab.

    3. Click Chassis.

    From here you can view the following by hovering your mouse over them:

    NVRAM

    PCI Slots

    SAS

    Power Supply

    PS FAN

    Riser Expansion

    Temperature

  • FANs

    FRONT and BACK chassis views

    The CLI command is system show status.

  • Lab 3.2: Copy Data to a Data Domain System

  • Module Review

  • Module 4: Manage Network Interfaces

    Module Introduction and Objectives

    After completing this module, you should be able to:

    Manage network interfaces, settings, and routes

    Describe and use link aggregation.

    Describe and use link failover.

    Describe and use VLAN tagging.

    Describe and use IP aliases.

  • Lesson 1: Interfaces, Settings, & Routes

    Use the Hardware > Network > Settings view to view and configure network settings.

  • Manage Network Routes

    Routes determine the path taken to transfer data between the Data Domain system and another network

    or host. To set the default gateway:

    1. From the Navigational pane, select the Data Domain system to configure.

    2. Click the Hardware > Network > Routes tabs.

    3. Click Edit in the Default Gateway area.

    The Configure Default Gateway dialog box appears.

    4. Choose how the gateway address is set. Either:

    Select the Use DHCP value radio button to set the gateway.

    With this option, the gateway is configured using the value from the Dynamic Host Configuration Protocol (DHCP) server.

    Select the Manually Configure radio button.

    The Gateway address box becomes available.

    Enter the gateway address in the Gateway field.

  • 5. Click OK.

    The system processes the information and returns you to the Routes tab.

  • Create Static Routes

    1. From the Navigational pane, select the Data Domain system to configure.

    2. Click the Hardware > Network > Routes tabs.

    3. Click Create in the Static Routes area

    The Create Routes dialog box appears.

    4. Select an interface to configure for the static route.

    a. Click the check boxes for the interface(s) whose route you are configuring.

    b. Click Next.

    5. Specify the Destination. Select either of the following.

    The Network Address and Netmask.

    a. Click the Network radio button.

    b. Enter destination information by providing the destination network address and netmask.

  • Note: This is not the IP of any interface. The interface is selected in the initial dialog and it is used for routing traffic.

    The Host Name or IP address of host destination.

    a. Click the Host radio button.

    b. Enter the Host Name or IP address of the destination host to use for the route.

    6. Optionally, change the gateway for this route.

    a. Click the check box, Specify different gateway for this route.

    b. Enter a gateway address in the Gateway field.

    7. Review changes, click Next.

    The Create Routes > Summary page appears. The values listed reflect the new configuration.

    8. Complete the action, click Finish.

    Progress messages display. When changes are applied, the message indicates Completed. Click OK to close the dialog.

    The new route specification is listed in the Route Spec list.
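
    To see how a static route and the default gateway interact, the following Python sketch picks a gateway for a destination address. It assumes conventional longest-prefix-match routing, which the guide does not describe explicitly; the function name and all addresses are hypothetical.

```python
import ipaddress


def select_gateway(destination, static_routes, default_gateway):
    """Pick the gateway of the most specific static route that matches.

    static_routes is a list of (network, gateway) pairs. A host route is
    written as a /32 network. Falls back to the default gateway.
    """
    dest = ipaddress.ip_address(destination)
    best = None
    for network, gateway in static_routes:
        net = ipaddress.ip_network(network)
        if dest in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway)
    return best[1] if best else default_gateway


# Hypothetical routes: one network route and one more specific route.
routes = [
    ("10.20.0.0/16", "192.168.1.254"),
    ("10.20.30.0/24", "192.168.1.253"),
]
```

    Here a destination of 10.20.30.7 matches both routes, and the /24 route wins because it is more specific. A destination that matches no static route uses the default gateway.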

  • Lesson 2: Link Aggregation

    Definition

    Link aggregation uses multiple Ethernet network cables, ports, and interfaces (links) in parallel to increase network

    throughput across a LAN or LANs until the maximum computer speed is reached. Data processing

    becomes faster than when data is sent over individual links. For example, you can enable link aggregation

    on a virtual interface (veth1) that groups the physical interfaces eth0a and eth0b in link aggregation control

    protocol (LACP) mode with the XOR-L2 hash.

    Link aggregation evenly splits network traffic across all links or ports in an aggregation group. It does this with minimum impact to the splitting, assembling, and reordering of out-of-order packets.

    Aggregation is between 2 directly attached systems (point-to-point and physical or virtual). Normally the aggregation is between the local system and the network device or system that is connected. Normally a Data Domain system is connected to a switch or router.

    Aggregation is handled between the IP layer (L3 and L4) and the MAC layer (L2) network driver.

    Link aggregation performance is impacted by the following:

  • Switch speed: normally the switch can handle the speed of each link that is connected to it, but it may lose some packets if all of the packets are coming from several ports that are concentrated on one uplink running at maximum speed. In most cases, this means that you can use only 1 switch for port aggregation coming out of a Data Domain system. Some network topologies allow for link aggregation across multiple switches.

    How much the Data Domain system can process

    Out-of-order packets: a network program must put out-of-order packets back to the original order. If the link aggregation mode allows the packets to be sent out-of-order, and the protocol requires that they be put back to the original order, the added overhead may impact the throughput speed enough so that the link aggregation mode that caused the out-of-order packets shouldn't be used.

    Number of clients: in most cases, either the physical or OS resources can't drive data at multiple Gbps. Also, due to hashing limits, you'd need multiple clients to push data at multiple Gbps.

    Number of streams (connections) per client can significantly impact link utilization depending on the hashing that you use.

    A Data Domain system supports 2 aggregation methods:

    1. Round robin

    2. Balanced-xor (you set up manually on both sides)

    Requirements

    Links can only be part of 1 group.

    Aggregation is between 2 systems.

    All links in a group have the same speed.

    All links in a group are half-duplex or full-duplex.

    No changes to the network headers are allowed.

    You must have a unique address across aggregation groups.

    Frame distribution must be predictable and consistent.
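
    The difference between round robin and a balanced XOR hash can be sketched in Python. The exact hash used by the Data Domain bonding driver is not given in this guide, so the formulas below are assumptions: a simple XOR of the source and destination values, reduced modulo the number of links. All function names are hypothetical.

```python
import ipaddress


def round_robin_links(packet_count: int, link_count: int) -> list:
    """Round robin: packets go to links 0, 1, ..., then wrap around."""
    return [i % link_count for i in range(packet_count)]


def mac_to_int(mac: str) -> int:
    """Convert a MAC address such as 00:1b:21:aa:bb:cc to an integer."""
    return int(mac.replace(":", ""), 16)


def xor_l2_link(src_mac: str, dst_mac: str, link_count: int) -> int:
    """Layer 2 hash: XOR of the two MAC addresses picks the link."""
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % link_count


def xor_l3l4_link(src_ip, dst_ip, src_port, dst_port, link_count):
    """Layer 3/Layer 4 hash: XOR of IP addresses and port numbers."""
    ip_bits = int(ipaddress.ip_address(src_ip)) ^ int(ipaddress.ip_address(dst_ip))
    return (ip_bits ^ src_port ^ dst_port) % link_count
```

    Because a hash depends only on the addresses (and, for L3L4, the ports), every frame of one connection takes the same link, which keeps distribution predictable and avoids out-of-order packets. It also shows why a single client or a single stream may not spread across all links, while round robin spreads packets evenly but can deliver them out of order.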

  • Create a Virtual Interface for Link Aggregation

    To create a link aggregation virtual interface:

    1. From the navigation pane, select a Data Domain system.

    2. Click Hardware > Network > Interfaces.

    3. Under the Interfaces tab, disable the physical interface where you want to add the virtual interface.

    Select No from the Enabled pull-down menu.

    4. From the Interface Type pull-down menu, select Virtual Interface.

    The Create Virtual Interface dialog box appears.

    5. Specify a virtual interface name in the veth text box.

    Enter a virtual interface name in the form vethx, where x is a unique ID (typically 1 or 2 digits). A typical full virtual interface name with VLAN and IP alias is veth56.3999.199. The maximum length of the full name is 15 characters. Special characters are not allowed. Numbers must be between 0 and 9999.

    6. Select Aggregate from the bonding Type drop-down list.

  • Registry settings can be different from the bonding configuration. When you add interfaces to the virtual interface, the information isn't sent to the bonding module until the virtual interface is brought up. Until that time the registry and the bonding driver configuration are different.

    7. From the General tab, specify the Bonding Mode.

    Specify a bonding mode that's compatible with the requirements of the system to which the interfaces are directly attached. The available modes are:

    Round robin: transmits packets in sequential order from the 1st available link through the last in the aggregated group.

    Balanced: data is sent over interfaces as determined by the hash method you select. Associated interfaces on the switch must be grouped into an Ether channel (trunk).

    LACP: Similar to Balanced except it has a control protocol that communicates with the other end and coordinates what links, within the bond, are available. It provides heartbeat failover.

    8. Specify the bonding hash.

    In the General tab, from the Bonding Hash pull-down menu select either Layer 2 (L2) or Layer 3/Layer 4 (L3L4).

    9. Select an interface to add to the aggregate configuration by clicking the check box corresponding to the interface.

    10. Click Next.

    The Create virtual interface veth_name dialog box appears.

    11. Enter an IP address.

    12. Enter a Netmask address.

    The Netmask is the subnet portion of the IP address that is assigned to the interface.

    The format is usually 255.255.255.XXX, where XXX is the value that identifies the interface. If you don't specify a netmask, the Data Domain system uses the netmask format as determined by the TCP/IP address class (A, B, C) that you are using.

    13. Specify speed and duplex options.

    Note: Aggregation isn't available for NICs.

    The combination of speed and duplex settings define the rate of data transfer through the interface.

    Layer 2 (XOR-L2): Transmit based on static balanced and LACP mode aggregation with an XOR hash of Layer 2 (inbound and outbound MAC addresses).

    Layer 2/Layer 3 (XOR-L2L3): Transmit based on static balanced and LACP mode aggregation with an XOR hash of Layer 2 (inbound and outbound MAC addresses) and Layer 3 (inbound and outbound IP addresses).

    Layer 3/Layer 4 (XOR-L3L4): Transmit based on static balanced and LACP mode aggregation with an XOR hash of Layer 3 (inbound and outbound IP addresses) and Layer 4 (inbound and outbound port numbers).

  • Autonegotiate Speed/Duplex: select this option to allow the NIC to autonegotiate the line speed and duplex setting for an interface.

    Manually configure Speed/Duplex: select this option to manually set an interface data transfer rate.

    Duplex options are half-duplex or full-duplex.

    Speed options are limited to the capabilities of the hardware. The options are 10 Base-T, 100 Base-T, 1000 Base-T (Gb), and 10,000 (10 Gb).

    Half-duplex is only available in 10 Base-T and 100 Base-T speeds.

    1000 and 10,000 line speeds require full-duplex.

    Optical interfaces require the Autonegotiate option.

    The copper interface default is 10 Gb. If a copper interface is set to 1000 or 10,000 line speed, duplex must be full-duplex.

    14. Specify maximum transfer unit (MTU) settings.

    This sets the size for the physical (Ethernet) interface. Supported values are from 350 to 9014. For 100 Base-T and gigabit networks, 1500 is the default.

    15. Click the Default button to return the setting to the default value.

    16. Ensure that all of your network components support the size set with this option.

    17. Select Dynamic Registration (optional).

    The dynamic DNS (DDNS) protocol enables machines on a network to communicate with and register IP addresses on a Data Domain system DNS server.

    The DDNS must be registered to enable this option.

    18. Click Next.

    The Configure Interface Settings summary page appears.

    19. Ensure that the values listed are correct.

    20. Click Finish

    21. Click OK.
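
    Several of the rules in this procedure are simple range and format checks. The Python sketch below collects three of them: the virtual interface name format, the speed and duplex combinations, and the MTU range. It is a hypothetical pre-check, not how the Enterprise Manager validates input; treating the stated ranges as inclusive and allowing at most two dotted suffixes (VLAN and IP alias) are assumptions.

```python
import re


def valid_interface_name(name: str) -> bool:
    """vethx, optionally followed by .VLAN and .alias; 15 characters at most."""
    if len(name) > 15:
        return False
    match = re.fullmatch(r"veth(\d+)((?:\.\d+){0,2})", name)
    if not match:
        return False
    numbers = [int(match.group(1))]
    numbers += [int(part) for part in match.group(2).split(".") if part]
    return all(0 <= n <= 9999 for n in numbers)


def valid_speed_duplex(speed_mbps: int, duplex: str) -> bool:
    """Half-duplex is only allowed at 10 and 100; 1000 and 10,000 need full."""
    if speed_mbps not in (10, 100, 1000, 10000):
        return False
    if duplex == "full":
        return True
    return duplex == "half" and speed_mbps in (10, 100)


def valid_mtu(mtu: int) -> bool:
    """Supported MTU values are from 350 to 9014."""
    return 350 <= mtu <= 9014
```

    The example name from the guide, veth56.3999.199, is exactly 15 characters long, so it passes; adding one more digit would exceed the limit.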

  • Lesson 3: Link Failover

    Link failover improves network availability by keeping backups operational during network glitches.

    Link failover is supported by a bonding driver on a Data Domain system. The bonding driver checks the carrier

    signal on the active interface every 9 seconds. If the carrier signal is lost, the active interface is

    changed to another standby interface. An ARP is sent to indicate that the data must flow to the new interface.

    The interface can be:

    On the same switch

    On a different switch

    Directly connected

    Specifications

    Only 1 interface in a group can be active at a time.

    Data flows over the active interface. Non-active interfaces can receive data.

    You can specify a primary interface. If you do specify a primary interface, it will be the active interface if it's available.

  • Bonded interfaces can go to the same or different switches.

    You do not have to configure a switch to make link failover work.

    1 GbE interface

    You can put 2, or more, interfaces in a link failover bonding group.

    The bonding interfaces can be:

    On the same card

    Across cards

    Between a card and a motherboard

    Link failover is independent of the interface type. For example, copper and optical can be failover links if the switches support the connections.

    10 GbE interface

    You can put only 2 interfaces in a failover bonding group.

    The bonding interfaces can only be on the same card.
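
    The failover rules above (one active interface, a preferred primary, and a switch to a standby when the carrier is lost) can be modeled as a small selection function. This is a simplified sketch, not the bonding driver's logic: the function name is hypothetical, and the order in which standby interfaces are tried is an assumption.

```python
def choose_active(interfaces, carrier_up, current=None, primary=None):
    """Return the interface that should be active, or None if all are down.

    interfaces: ordered list of interface names in the failover group.
    carrier_up: dict mapping each name to True if its carrier signal is present.
    """
    # A configured primary is always preferred while it is available.
    if primary is not None and carrier_up.get(primary, False):
        return primary
    # Otherwise keep the current active interface if its carrier is still up.
    if current is not None and carrier_up.get(current, False):
        return current
    # Carrier lost: fail over to the first standby with a carrier signal.
    for name in interfaces:
        if carrier_up.get(name, False):
            return name
    return None
```

    For a group of eth0a and eth0b with eth0a as primary, losing the carrier on eth0a moves traffic to eth0b, and traffic returns to eth0a once its carrier is back.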

  • Manage Link Failover

  • Manage Link Failover (Continued)

  • Lesson 4: Manage VLAN and IP Alias Network Interfaces

    Introduction and Objectives

    In this lesson you'll learn about VLAN and IP alias network interfaces and how to manage them through the

    Enterprise Manager.

  • Define VLAN and IP Alias Network Interfaces

    Virtual local area networks (VLANs) manage subnets on a network. VLANs enable a LAN to transcend physical

    boundaries. They enable you to segregate network broadcasting. They are used to:

    Provide security

    Speed up network traffic

    Organize network LANs.

    If you're not using VLANs, you can use IP aliases. IP aliases are easy to implement and are less expensive

    than VLANs, but they are not a true VLAN. For example, you must use 1 IP address for management and

    another IP address to back up or archive data. You can combine VLANs and IP aliases.

  • Create VLAN and IP Aliases

    VLAN tag insertion (VLAN tagging) enables you to create multiple VLAN segments. (You get the tags from the network administrator.)

    In a Data Domain system, you can have up to 4096 VLAN tags.

    Create a new VLAN interface from either a physical interface or a virtual interface.

    The recommended total number that can be created is 80, though it is possible to create up to 100 interfaces

    before the system is affected.

  • Create VLAN and IP Aliases (Continued)

    To create a VLAN tag from the Enterprise Manager:

    1. From the Navigational pane, select the Data Domain system to configure.

    2. Click the Hardware > Network > Interfaces tabs.

    3. Click Create and select the VLAN option.

    The Create VLAN dialog box appears.

    4. Specify a VLAN ID by entering a number in the ID field.

    The range of a VLAN ID is between 1 and 4095.

    5. Enter an IP Address.

    The Internet Protocol (IP) Address is the numerical label assigned to the interface. For example, 192.168.10.23

    6. Enter a Netmask address.

    The Netmask is the subnet portion of the IP address that is assigned to the interface. The format is typically 255.255.255.###, where the ### are the values that identify the interface. If you do not specify a netmask, the Data Domain system uses the netmask format determined by the TCP/IP address class (A, B, C) you are using.

    7. Specify MTU Settings.

  • This sets the maximum transfer unit (MTU) size for the physical (Ethernet) interface. Supported values are from 350 to 9014. For 100 Base-T and gigabit networks, 1500 is the standard default.

    Notes:

    Click the Default button to return the setting to the default value.

    Ensure that all of your network components support the size set with this option.

    8. Specify Dynamic DNS Registration option.

    Dynamic DNS (DDNS) is the protocol that allows machines on a network to communicate with, and register their IP address on, a domain name system (DNS) server. The DDNS must be registered to enable this option. Refer to Registering a DDNS for additional information.

    9. Click Next.

    The Configure Interface Settings summary page appears. The values listed reflect the new system and interface state.

    10. Click Finish and OK.
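
    Two values in this procedure follow fixed rules: the VLAN ID range and the netmask implied by the address class when none is specified. The Python sketch below illustrates both. The class boundaries are the conventional class A, B, and C ranges; the function names are hypothetical, and treating the VLAN ID range as inclusive is an assumption.

```python
import ipaddress


def valid_vlan_id(vlan_id: int) -> bool:
    """The guide gives the VLAN ID range as 1 to 4095."""
    return 1 <= vlan_id <= 4095


def classful_netmask(address: str) -> str:
    """Return the default netmask for a class A, B, or C IPv4 address."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:
        return "255.0.0.0"        # class A
    if first_octet < 192:
        return "255.255.0.0"      # class B
    if first_octet < 224:
        return "255.255.255.0"    # class C
    raise ValueError("no default netmask for class D or E addresses")
```

    For the example address 192.168.10.23, the default is the class C netmask 255.255.255.0. Entering the netmask explicitly avoids relying on the class default.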

  • Create VLAN and IP Aliases (Continued)

    Create a new IP Alias interface from either a physical interface or a virtual interface.

    The recommended total number of IP Aliases and virtual interfaces that can be created is 80 though it is possible to create up to 100 interfaces.

    1. From the Navigational pane, select the Data Domain system to configure.

    2. Click the Hardware > Network > Interfaces tabs.

    3. Click Create and select the IP Alias option.

    The Create IP Alias dialog box appears.

    4. Specify an IP Alias ID by entering a number in the eth0a field.

    Requirements are: 1 to 4096

    5. Enter an IP Address. The Internet Protocol (IP) Address is the numerical label assigned to the interface. For example, 192.168.10.23

    6. Enter a Netmask address.

    The Netmask is the subnet portion of the IP address that is assigned to the interface.

    The format is typically 255.255.255.###, where the ### are the values that identify the interface. If you do not specify a netmask, the Data Domain system uses the netmask format determined by the TCP/IP address class (A, B, C) you are using.

  • 7. Specify Dynamic DNS Registration option.

    Dynamic DNS (DDNS) is the protocol that allows machines on a network to communicate with, and register their IP address on, a Domain Name System (DNS) server.

    The DDNS must be registered to enable this option. Refer to Registering a DDNS for additional information.

    8. Click Next. The Configure Interface Settings summary page appears. The values listed reflect the new system and interface state.

    9. Click Finish and OK.

  • Module Review

  • Module Review (Continued)

  • Module Review (Continued)

  • Module 5: Manage a VTL

    Module Introduction and Objectives

  • VTL Overview

    In some environments the Data Domain system must be configured as a Virtual Tape Library. This practice

    may be motivated by the need to leverage existing backup policies that were built around using physical

    tape libraries. Using a VTL can also be a step in a longer range migration to disk based media for backup,

    or it may be driven by the need to minimize the effort to recertify a system to meet compliance needs.

    Virtual tape libraries emulate the physical tape equipment and function. Different tape library products

    may package some things in different ways and the names of some elements may differ between products

    but the fundamentals are basically the same. Data Domain systems are configured using the concepts of

    libraries, tapes, Cartridge Access Ports, and barcodes.

    The Data Domain VTL software option requires installation of a Data Domain VTL HBA to connect to a Fibre

    Channel storage network. Activating the VTL configuration also requires the installation of a VTL license

    key.

  • The goal of configuring a Data Domain system as a VTL is to provide an interface that the backup software can work with as if it were a physical tape library. Some parameters must be configured in the VTL environment to structure the libraries and the number and size of elements within each. Many of these parameters are dictated by the tape technology and library model being emulated. Consult the product documentation and best practice guides for the definitions and ranges of each parameter.

  • Configuration Terms

  • VTL Planning

    You must configure parameters in the VTL environment to structure the number and size of elements within each library. Parameters are dictated by the tape technology and library you are emulating. Consult the product documentation and best practices guides for information about the definitions and ranges for each parameter.

  • Capacity Planning

    See the best practices documentation for planning a virtual library configuration.

    Capacity calculations are a function of the backup set size and the compression ratios.

    Consider how space gets allocated and recovered in events like tape creation and expiration.

    With deduplication, the amount of disk space you need for a tape depends on the compression ratio. Space isn't pre-allocated when virtual tapes are created, so the logical setup that emulates a large tape library consumes little disk space. You monitor disk space use and compression using the same techniques you use for monitoring data backed up over NFS and CIFS.

    You must clean a Data Domain system to reclaim data storage space from an expired tape image. Because Data Invulnerability Architecture (DIA) requirements prevent space from being reused until cleaning runs, large amounts of space may be unavailable in the meantime. In some environments, you must manage disk use closely and may need to run system cleaning frequently.
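
    The relationship between backup set size, compression ratio, and disk space can be sketched in a few lines of Python. This is a rough planning aid, not a Data Domain sizing tool; the function names and the example figures are assumptions chosen for illustration.

```python
import math


def physical_space_needed(backup_set_gib: float, compression_ratio: float) -> float:
    """Estimate the disk space (GiB) a backup set consumes after deduplication and compression."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return backup_set_gib / compression_ratio


def tapes_needed(backup_set_gib: float, tape_capacity_gib: float) -> int:
    """Count the virtual tapes needed to hold the logical (pre-compression) backup set."""
    if tape_capacity_gib <= 0:
        raise ValueError("tape_capacity_gib must be positive")
    return math.ceil(backup_set_gib / tape_capacity_gib)


# Hypothetical example: a 10,000 GiB backup set at a 10:1 compression ratio,
# written to virtual tapes of 400 GiB logical capacity.
disk_gib = physical_space_needed(10_000, 10)
tape_count = tapes_needed(10_000, 400)
```

    The tape count depends only on the logical size, while the disk space depends on the compression ratio, which is why a large virtual library can be defined without consuming much disk up front.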

  • Create Tapes

  • VTL Barcode Definition

    When a tape is created, it is assigned a barcode that uniquely identifies it. The eight-character barcode must start with six numeric or upper-case alphabetic characters (from the set {0-9, A-Z}) and end in a two-character tag for the supported LTO-1, LTO-2, or LTO-3 tape type.
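
    The barcode rule above can be checked with a regular expression. The sketch below is illustrative only: the text does not list the two-character tags, so the values L1, L2, and L3 (for LTO-1, LTO-2, and LTO-3) are assumptions, as is the function name.

```python
import re

# Assumed two-character tags for the LTO-1, LTO-2, and LTO-3 tape types.
# Check the product documentation for the tags your release supports.
ASSUMED_TAGS = ("L1", "L2", "L3")

# Six characters from {0-9, A-Z}, followed by one of the assumed tags.
BARCODE_RE = re.compile(r"[0-9A-Z]{6}(?:" + "|".join(ASSUMED_TAGS) + r")")


def is_valid_barcode(barcode: str) -> bool:
    """Return True if the barcode has eight characters in the expected layout."""
    return BARCODE_RE.fullmatch(barcode) is not None
```

    For example, `is_valid_barcode("A00000L3")` passes, while a lower-case prefix, a seven-character string, or an unknown tag is rejected.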

  • Configure VTL Barcode

    Data Domain recommends creating tapes with unique barcodes only. Duplicate barcodes in the same tape pool create an error. Although no error is created for duplicate barcodes in different pools, duplicate barcodes may cause unpredictable behavior in backup applications.
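
    Before creating tapes, a planned tape list can be screened for both kinds of duplicates. This is a minimal sketch that assumes the plan is available as (pool, barcode) pairs; the pool names and the function are hypothetical and not part of the Data Domain software.

```python
from collections import defaultdict


def find_duplicate_barcodes(tapes):
    """Split duplicate barcodes into same-pool and cross-pool sets.

    tapes: iterable of (pool_name, barcode) pairs.
    Returns (same_pool, cross_pool), two sets of barcodes.
    """
    pools_by_barcode = defaultdict(list)
    for pool, barcode in tapes:
        pools_by_barcode[barcode].append(pool)

    same_pool = set()
    cross_pool = set()
    for barcode, pools in pools_by_barcode.items():
        if len(pools) != len(set(pools)):
            same_pool.add(barcode)   # repeated inside one pool: reported as an error
        if len(set(pools)) > 1:
            cross_pool.add(barcode)  # no error, but may confuse backup applications
    return same_pool, cross_pool


# Hypothetical plan with one duplicate of each kind.
plan = [
    ("Default", "A00000L3"),
    ("Default", "A00000L3"),
    ("Default", "A00001L3"),
    ("Offsite", "A00001L3"),
    ("Offsite", "A00002L3"),
]
same_pool, cross_pool = find_duplicate_barcodes(plan)
```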

  • Configure VTL Barcode (Continued)

  • Configure VTL Barcode (Continued)

  • Module Review

  • Module 6: Manage Data

    Module Introduction and Objectives

  • Lesson 1: Snapshots

    Use a snapshot to save a backup directory copy at a specific point in time. You can use it later to restore files from a specific point in time, for example, before a Data Domain OS upgrade.

    Snapshot benefits:

    Snapshots do not use many system resources

    Snapshots are free (do not require a license)

    A snapshot is a read-only copy of the entire backup directory.

    You can run multiple snapshot schedules at the same time or create snapshots when you choose, for example, before you upgrade or configure your system.

    Snapshots are created in the /backup/.snapshot directory.

    A Data Domain system can store a maximum of 750 snapshots. You receive a warning when the number of snapshots reaches 90% of the maximum (675-749). An alert is generated when the maximum is reached. To clear the warning:

    1. Expire snapshots

    2. Run file system cleaning
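
    The snapshot thresholds described above can be expressed as a small check. The sketch below only models the limits stated in the text (750 maximum, warning from 90%); the function name and return values are assumptions for illustration.

```python
MAX_SNAPSHOTS = 750
WARNING_PERCENT = 90


def snapshot_status(count: int) -> str:
    """Classify a snapshot count as 'ok', 'warning', or 'alert'."""
    warning_at = MAX_SNAPSHOTS * WARNING_PERCENT // 100  # 675
    if count >= MAX_SNAPSHOTS:
        return "alert"
    if count >= warning_at:
        return "warning"
    return "ok"
```

    A monitoring script could call `snapshot_status()` with the current snapshot count and prompt an administrator to expire snapshots and run cleaning once it returns "warning".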

    If the Data Domain system is a source for collection replication, snapshots are replicated. If the Data Domain system is a source for directory replication, snapshots are not replicated. You must create and replicate snapshots separately.

    Note: Use snapshots judiciously; segments held by a snapshot remain on the system after the original files are deleted.

  • Manage Snapshots

    To create a snapshot:

    1. Select Data Management > Snapshots

    2. Click Create to create a snapshot

    3. Click Modify Expiration Date to choose one of the following:

    Never Expire

    In: the snapshot expires in the number of days, weeks, months, or years you select

    On: the snapshot expires at the first minute of the day you select

    4. Select the Schedule tab to schedule snapshots

  • Lab 6.1: Configure Snapshot

  • Lesson 2: Fastcopy

    Use fastcopy to retrieve data stored in snapshots. Fastcopy copies files and directory trees of a source directory to a target directory on a Data Domain system. The fastcopy force option allows the destination directory to be overwritten if it exists. Fastcopy makes a destination equal to the source, but not at a particular point in time. The source and destination may not be equal if you change either during the copy operation. Fastcopy takes space (it's like a clone).

  • Perform Fastcopy

    1. Data Management > File System > More Tasks >

    2. Enter source and destination

    3. Select the check box to overwrite the existing destination (if it exists)

  • Lab 6.2: Perform Fastcopy

  • Lesson 3: Retention Lock

    The EMC Data Domain Retention Lock licensed software feature enables organizations to protect records in non-writeable and non-erasable formats for a specified length of time, up to 70 years. Retention Lock protects against:

    Accidents and user errors

    Malicious activity

    Licensed feature

  • Retention Lock (Continued)

    Minimum and maximum retention periods are set globally

    Within the global parameters, retention parameters can be set on a file-by-file basis

    Retention Lock is designed to respond to commands from the user, archive software, or some backup applications, which trigger the lock on stored data.

    Assumes administrators are trustworthy

    Allows administrative users to do the following

    Override global retention settings

    Update permissions of locked files

    Rename empty directories

    Fully integrated with Data Domain replication

    File locking is initiated by the user or by backup or archive software

  • Retention Lock (Continued)

    The user or the storage software sets the access time (atime) of the file to a future date that is beyond the minimum retention period configured on the Data Domain system. Setting the atime is the signal to lock the file. As soon as this value is set, the file is locked and cannot be deleted or modified before that date.
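
    From a client, setting the atime is an ordinary file system operation. The sketch below uses Python's standard `os.utime` call to move a file's atime into the future while leaving its modification time unchanged. The helper name and the example path are hypothetical. On an ordinary file system the call only changes the timestamp; the locking behavior described above applies only on a Data Domain file system with Retention Lock enabled.

```python
import os
from datetime import datetime, timedelta


def set_retention_atime(path: str, retain_until: datetime) -> None:
    """Set the file's atime to retain_until and keep its current mtime."""
    current = os.stat(path)
    os.utime(path, (retain_until.timestamp(), current.st_mtime))


# Hypothetical usage: request retention for roughly one year.
# The path assumes a Data Domain share mounted at /mnt/dd on the client.
# set_retention_atime("/mnt/dd/backup/archive/record.dat",
#                     datetime.now() + timedelta(days=365))
```

    The chosen date must fall between the minimum and maximum retention periods configured on the system.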

    Retention Lock extends the utility of Data Domain systems into environments that require granular controls over the protection of individual files stored on the system.

  • Configure Retention Lock

  • Lesson 3: Sanitization

    With the sanitize function, deleted files can be overwritten using a DoD/NIST-compliant algorithm and procedures. No complex setup or disruption is needed. Clean data is available during the sanitization process, with limited disruption to daily operations. Sanitizing is the electronic equivalent of data shredding. It removes any trace of deleted files.

    Sanitization supports organizations that:

    Are required to remove and destroy confidential data that is accidentally written to unapproved systems

    Are required to delete data that is no longer needed.

    Need to resolve classified message incidents (CMIs).

    The system sanitize command erases the following:

    Segments of deleted files that are not used by other files

    Contaminated metadata

    All unused capacity in the file system

    Not all segments used by a deleted file can be erased, because some segments may also be used by other files.

    CLI commands: system sanitize, system sanitize start
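
    The rule that only segments not used by other files are erased amounts to a set difference. The following is a simplified model, assuming each file is represented by the set of segment identifiers it references; the names and data are hypothetical and do not reflect how the system stores its metadata.

```python
def segments_to_erase(deleted_files, live_files):
    """Return segments referenced only by deleted files.

    Both arguments map a file name to the set of segment IDs it references.
    """
    deleted_segments = set().union(*deleted_files.values())
    live_segments = set().union(*live_files.values())
    return deleted_segments - live_segments


# Hypothetical example: segment "s2" is shared with a live file, so it must stay.
deleted = {"classified.doc": {"s1", "s2", "s3"}}
live = {"report.doc": {"s2", "s4"}}
erasable = segments_to_erase(deleted, live)
```

    Here `erasable` contains "s1" and "s3". The shared segment "s2" stays because a live file still needs it.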

  • Lab 6.3: Configure Sanitization

  • Lesson 4: Encryption of Data at Rest

    With the encryption software option licensed and enabled, all incoming data is encrypted inline before it is written to disk. This is a software-based approach, and it requires no additional hardware. It includes:

    Configurable 128-bit or 256-bit Advanced Encryption Standard (AES) algorithm with either:

    Confidentiality with Cipher Block Chaining (CBC mode)

    Or

    Both confidentiality and message authenticity with Galois/Counter Mode (GCM mode)

    Encryption and decryption to and from disk is transparent to all access protocols: DD Boost, NFS, CIFS, and VTL (no administrative action is required for decryption)

    Passphrase and Encryption Key

    When you enable encryption, you are prompted for a passphrase. The system generates an encryption key and uses the passphrase to encrypt the key. One key is used to encrypt all data written to the system. After encryption is enabled, the passphrase is needed by administrative users only:

    If locking or unlocking the file system

    If disabling encryption

    One key is used for all data in a system.

    The encryption key is passphrase protected.

    The administrative user specifies the passphrase when enabling encryption.

    The passphrase is needed by administrative users only:

    If locking or unlocking the file system

    If disabling encryption

    Take great care not to lose the passphrase, or data can be irrevocably lost. If, for example, the passphrase is lost after a file system is locked, the file system cannot be unlocked. The passphrase is not stored on disk, so it can't be recovered. Two administrative users use the passphrase to lock the Data Domain system and its external storage devices, and they enter a new passphrase during the locking procedure. A thief who steals a system can't unlock the file system without the passphrase.

    File System Locking

    After encryption is enabled, the file system can be locked by two administrative users who work together.

    To unlock the file system, one administrative user enters the new passphrase.

    Useful before a system is transported or a disk is replaced

    Without encryption, user data could possibly be recovered by a thief with a forensic tool (particularly if local compression is turned off)

    If a file system is locked before a disk is shipped, the disk is protected from forensic analysis because:

    The encryption key (which is stored on disk encrypted by the passphrase) can't be recovered

    Any user data is encrypted (and the data can't be decrypted without the key)

    Only an administrative user who has the new passphrase created during the locking procedure can unlock a file system

  • Encryption Flow

  • Configure Encryption

    1. In the CLI, ensure the Data Domain Encryption license is enabled.

    (See Appendix A for Data Domain software licenses.)

    Disable the file system

    # filesys disable

    Enable encryption

    # filesys encryption enable

    2. Enter a passphrase when prompted

    Select an alternative cryptographic algorithm (optional)

    # filesys encryption algorithm set algorithm

    Default algorithm is aes_256_cbc. Other options are: aes_128_cbc, aes_128_gcm, or aes_256_gcm

  • Apply Encryption Changes

  • Deactivate Encryption

  • Lab 6.4: Configure Encryption

  • Lesson 5: File System Cleaning

    Cleaning reclaims physical storage occupied by deleted objects. For example, as retention periods expire, old backups are deleted. Space from deleted backups becomes available only after cleaning reclaims the disk space.

    When application software expires backup or archive images, they are deleted in the sense that they are no longer accessible or available for recovery from the application. The images still occupy physical storage. The clean operation reclaims the segments used by files that are deleted and are no longer referenced.

    Cleaning may require a lot of system resources. Self-throttling mechanisms automatically adjust the priority assigned to cleaning tasks in favor of more time-critical processing tasks. Cleaning schedules are adjustable. By default, cleaning is scheduled to start every Tuesday at 6:00 am. You should schedule cleaning for times when system traffic is lowest.

    A Data Domain system is available for write and read operations during cleaning.

  • Cleaning

    Cleaning enables you to reorganize data to improve the speed and efficiency of deduplication.

    Data invulnerability requires that data is written only into new containers. This requirement also applies to cleaning. Copy forward segments are segments that, for deduplication efficiency, should be stored adjacent to each other, so they are copied forward together into a single container.

    Dead segments are dead because the files that referred to them were all deleted and the pointers were removed. Dead segments are not allowed to be overwritten with new data, since this could put valid data at risk of corruption. Instead, valid segments are copied forward into free containers to group the remaining valid segments together. When the data is safe and reorganized, the original containers are returned to the available disk space.
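
    The copy-forward idea can be illustrated with a toy model in Python. This is a deliberately simplified sketch, not the Data Domain cleaning algorithm: containers are modeled as lists of segment IDs, and the function name, container size, and data are assumptions. Existing containers are never modified in place. Live segments from containers that hold dead segments are written into new containers, and only then are the original containers counted as freed.

```python
def clean(containers, live_segments, container_size):
    """Copy live segments forward and report how many containers are freed.

    containers: list of lists of segment IDs.
    live_segments: set of segment IDs still referenced by some file.
    Returns (containers_after_cleaning, freed_container_count).
    """
    untouched = []      # containers with no dead segments stay where they are
    copy_forward = []   # live segments rescued from containers with dead segments
    freed = 0
    for container in containers:
        if all(segment in live_segments for segment in container):
            untouched.append(container)
        else:
            copy_forward.extend(s for s in container if s in live_segments)
            freed += 1  # the whole original container is reclaimed afterwards

    # Write the rescued segments into new containers, grouped together.
    new_containers = [
        copy_forward[i:i + container_size]
        for i in range(0, len(copy_forward), container_size)
    ]
    return untouched + new_containers, freed


# Hypothetical example: segments e, f, and i are dead.
before = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
after, freed = clean(before, {"a", "b", "c", "d", "g", "h"}, container_size=3)
```

    In this example two containers are freed and their three live segments end up together in one new container, so the net gain is one container.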

    Since DDS is a log-structured file system, space that was deleted (see types below) must be reclaimed. This is called cleaning.

    Cleaning is done for 2 main reasons:

    1. Housekeeping: periodically, segments are considered dead if the files that referenced them were deleted.

    2. Performance tuning: approximately 10% of the duplicate data is rewritten on the disk for performance reasons. For example, rewriting duplicate segments (copy forward segments) into the same location (segment locality) can be more advantageous than having the segments spread across different locations.

    The default time schedule for file system cleaning is every Tuesday at 6 am. The default CPU throttle is 50%. You can change these options using the Enterprise Manager or the CLI.

  • Lab 6.5: Configure File-System Cleaning

  • Lesson 6: Monitor File-System Space Usage

  • File System Summary Tab

  • Space Usage

    Monitor space use to determine if you have enough space (through extrapolation). If you don't have enough space, you can do the following:

    Increase capacity

    Reduce retention periods

    Reduce amount of data

    Space Usage Terms

    Term | GUI Term | Definition

    pre-compression | Pre Compression Written | Data sent to the Data Domain system before deduplication, local compression, or both

    post-compression used | Post-Comp Used | Space used after compression

    compression factor | Comp Factor | The compression factor (global ratio, cumulative ratio) is the amount of data footprint reduction. It is pre-compression data divided by data collection. It is a global setting.

    cleaning | Cleaning | Cleaning date. Notice that the compression factor increases after each cleaning.
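
    As a worked example of the compression factor, the sketch below divides pre-compression data by the space used after compression. Treating the divisor as the Post-Comp Used value is an assumption made for illustration, as are the function name and the figures.

```python
def compression_factor(pre_comp_gib: float, post_comp_used_gib: float) -> float:
    """Return the ratio of pre-compression data to post-compression space used."""
    if post_comp_used_gib <= 0:
        raise ValueError("post_comp_used_gib must be positive")
    return pre_comp_gib / post_comp_used_gib


# Hypothetical figures: the same 5,000 GiB of pre-compression data,
# with post-compression usage dropping after cleaning reclaims dead segments.
factor_before_cleaning = compression_factor(5_000, 500)
factor_after_cleaning = compression_factor(5_000, 250)
```

    With these figures the factor rises from 10 to 20, which matches the note that the compression factor increases after each cleaning.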

  • Space Consumption Tab

    Space Consumption Terms

    Term | GUI Term | Definition

    capacity | Capacity |

    post-compression | Post-Comp |

    compression factor | Comp Factor | The compression factor (global ratio, cumulative ratio) is the amount of data footprint reduction. It is pre-compression data divided by data collection. It is a global setting.

    cleaning | Cleaning | Cleaning date

    data movement | Data Movement |

  • Daily Written Tab

    Daily Written Tab Terms

    Term | GUI Term | Definition

    pre-compression | Pre-Comp |

    post-compression | Post-Comp |

    total compression factor | Total-Comp Factor |

    global compression factor | Global-Comp Factor |

    local compression factor | Local-Comp Factor |

    total-compression factor (reduction) | Total-Comp Factor (Reduction) |
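
    The three compression factors are related: the total factor is the product of the global (deduplication) factor and the local compression factor, and the reduction is the same result expressed as a percentage. The sketch below shows the arithmetic under the assumption that the intermediate size after deduplication is known; the function and key names are illustrative and are not taken from the GUI.

```python
def daily_written_factors(pre_comp: float, after_global: float, post_comp: float) -> dict:
    """Compute compression factors from three sizes in the same unit.

    pre_comp: data as sent by the backup application.
    after_global: size after deduplication (global compression).
    post_comp: size after local compression, as written to disk.
    """
    if min(pre_comp, after_global, post_comp) <= 0:
        raise ValueError("all sizes must be positive")
    return {
        "global_factor": pre_comp / after_global,
        "local_factor": after_global / post_comp,
        "total_factor": pre_comp / post_comp,
        "reduction_percent": 100 * (pre_comp - post_comp) / pre_comp,
    }


# Hypothetical day: 1,000 GiB sent, 100 GiB unique after deduplication,
# 50 GiB written after local compression.
factors = daily_written_factors(1_000, 100, 50)
```

    For these figures the global factor is 10, the local factor is 2, the total factor is 20, and the reduction is 95%.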

  • Module Review

  • Module Review (Continued)

  • Module 7: Manage Data Replication and Recovery

    Module Introduction and Objectives

  • Lesson 1: Data Replication

    Because replication sends data over a WAN after it is deduplicated (only unique data segments are sent over the network) and compressed, network demands are reduced. Replicate (copy) data from one Data Domain system to another for:

    Disaster recovery

    Remote office data protection

    Multiple site tape consolidation

    Onsite archiving

    Once you configure replication between a source and destination, only new data written to the source is automatically replicated to the destination. Data is deduplicated at the source and at the destination. You can recover offsite replicated data online, so you don't need to remount tapes or transport them by truck.

    You need a replicator license for both source and destination Data Domain systems. See Appendix A for a complete list of Data Domain licenses.
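
    The idea that only unique segments cross the network can be sketched as follows. This is a conceptual model, not the Data Domain replication protocol: the SHA-256 fingerprint, the in-memory index, and the function names are all assumptions for illustration.

```python
import hashlib


def fingerprint(segment: bytes) -> str:
    """Identify a segment by a hash of its contents (SHA-256 chosen for illustration)."""
    return hashlib.sha256(segment).hexdigest()


def replicate(new_segments, destination_index):
    """Send only segments the destination does not already hold.

    destination_index: dict mapping fingerprint -> segment at the destination.
    Returns the list of segments actually sent.
    """
    sent = []
    for segment in new_segments:
        key = fingerprint(segment)
        if key not in destination_index:
            destination_index[key] = segment  # simulate transfer and store
            sent.append(segment)
    return sent


# Hypothetical example: the second backup repeats most of the first.
destination = {}
first_transfer = replicate([b"alpha", b"beta", b"alpha"], destination)
second_transfer = replicate([b"beta", b"gamma"], destination)
```

    The first call sends two segments and the second sends only one, because the destination already holds the other segment.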

  • Lesson Objectives

  • Data Replication Overview

  • Data Domain Replication Types

    Replication is set up with a source Data Domain system and one or more destination Data Domain systems. There are 3 replication types:

    1. Collection

    2. Directory

    3. Pool

  • Collection Replication

    With collection replication, user accounts and passwords are replicated from the source to the destination. Any changes made manually on the destination are overwritten after the next change is made on the source. It is recommended that changes be made only on the source.

    Entire /backup directory replicated

    Entire /backup directory duplicated

    Full system data replication mirror

    Immediate accessibility at destination

    Other than receiving data from the source, the destination is read-only. To make the destination readable and writable, break the replication context by using the replication break command.

    All user accounts and passwords are replicated from source

  • Directory Replication

    Directory replication provides replication at the level of individual directories. Each Data Domain system can be the source or destination for multiple directories and can be a source for some directories and a destination for others. During directory replication, a Data Domain system can perform normal backup and restore operations. A destination Data Domain system must have available storage capacity that is at least the post-compressed size of the expected maximum size of the source directory. A single destination Data Domain system can receive backups from both CIFS and NFS clients as long as separate directories are used for each. Do not mix CIFS and NFS data in the same directory. When replication is initialized, a destination directory is created automatically if it does not already exist. After replication is initialized, ownership and permissions of the destination directory are always identical to those of the source directory. At any time, due to differences in global compression, the source and destination directory can differ in size.

    Subdirectories under /backup are duplicated

    Destination must have available storage

    Can receive backups from both CIFS and NFS clients

    Do not mix CIFS and NFS data in same directory

    Destination directory created automatically