NetApp FAS3050-R200 EMC Centera Report



Page 1: NetApp FAS3050-R200 EMC Centera Report


July 2006 www.veritest.com • [email protected]

Network Appliance™ FAS3050, R200, and EMC Centera G4 Performance Comparison Test report prepared under contract from Network Appliance

Executive Summary

This report compares the archival performance of the Network Appliance FAS3050 and NearStore R200 with that of the EMC Centera G4 when used as back-end storage for a Microsoft Exchange 2003 e-mail environment archived with Symantec Enterprise Vault.

The back-end storage architecture is not the only factor when evaluating performance. Archival performance also depends on the application use and application server configuration. For this comparison, we generated 80.4 million e-mails using Microsoft LoadSim 2003, populating eight Exchange 2003 servers. All Exchange servers had 4GB of RAM and dual Xeon processors (2.8GHz to 3.2GHz). Likewise, the Symantec Enterprise Vault servers were equipped with 4GB of RAM and dual Xeon processors (2.8GHz up to 3.2GHz). We used a set of four to eight Enterprise Vault servers to archive, search, and re-index up to 1.72 million e-mails from the total population of 80.4 million e-mails. Using multiple large-memory servers allowed us to evenly distribute threads across back-end interfaces and maximize bandwidth utilization of each platform. It should be noted that we used a Network Appliance FAS3050 and a FAS3020 as the primary back-end storage for the Exchange and SQL databases. These were mutually exclusive arrays, separate from the archive storage platforms. The Exchange host servers connected to the FAS3050 and the FAS3020 via Fibre Channel using a Brocade 3800 Silkworm Fibre Channel switch. A detailed diagram of the performance test bed can be found in “Testing Methodology.”

Key Findings

100% Deployment (eight Exchange 2003 servers, eight Enterprise Vault servers):

• At baseline, the FAS3050 ingested content 38% faster than the 16-node Centera G4; the R200 was 45% faster than the Centera G4 and 5% faster than the FAS3050.

• During a single-drive failure, the FAS3050 archived 80% faster than the Centera G4, and the R200 held a 74% advantage.

• During a dual-drive failure, the FAS3050 archived 124% (2.24 times) faster than the Centera G4.

• During a Discovery Accelerator retrieval, the FAS3050 ingested content at a rate 62.5% greater than the Centera G4.

50% Deployment (four Exchange 2003 servers, four Enterprise Vault servers):

• At baseline, the FAS3050 ingested content 35% faster than the Centera G4.

• During a dual-drive failure, the FAS3050 archived 48% faster than the Centera G4.

Overall, the FAS3050 scaled archival performance from the 50% deployment to the 100% deployment at a rate 42% greater than the Centera G4, even as its file systems filled past 50% full.

Page 2: NetApp FAS3050-R200 EMC Centera Report


The EMC Centera G4 configuration followed the EMC Centera Best Practices Guide. A total of 16 G4 nodes were deployed. We assigned 4 nodes access roles, with the remaining 12 nodes assigned a storage role. Each storage node contained four 500GB SATA drives, for a total of forty-eight 500GB drives. All nodes were dedicated to their assigned role. This 3:1 storage node-to-access-node ratio exceeds the EMC best practices for small object workloads (<100K avg). The Centera was deployed using the default CPM data protection scheme as well as the Storage Strategy Performance setting to ensure optimal write performance in small object environments such as the e-mail archival environment.

We compared the Centera G4 against two platforms from Network Appliance: the NearStore® R200 and a single-node FAS3050. The R200 uses a single controller, or node, as well. We configured the R200 with one aggregate (or pool) of forty-eight 320GB SATA drives and four dedicated Gigabit Ethernet interfaces. The FAS3050 was also configured with four dedicated Gigabit Ethernet interfaces and a single aggregate of twenty-four 250GB SATA drives. Aside from capacity differences, both EMC and Network Appliance platforms used 7200 RPM SATA drives. The Enterprise Vault archive servers communicated with the Centera via the Centera access APIs. We used a standard CIFS or Windows® share connection to the FAS3050 and R200 for their respective tests.

We ran two scenarios. In the first scenario, we conducted what was termed a 100% deployment, composed of eight Exchange 2003 servers and eight Enterprise Vault servers, along with one Microsoft SQL server. We believed that the eight Exchange 2003 servers would generate a sufficient workload for the eight Enterprise Vault servers. In the second scenario, we conducted a 50% deployment using four Exchange 2003 servers and four Enterprise Vault servers, with one Microsoft SQL server.

To create our simulated e-mail environment, we used the industry-standard Microsoft Exchange Server 2003 Load Simulator (LoadSim), available from Microsoft at http://www.microsoft.com/downloads/details.aspx?familyid=92EB2EDC-3433-47CA-A5F8-0483C7DDEA85&displaylang=en. Using the LoadSim default message mixture, we generated 80.4 million e-mail messages across 15,000 mailboxes, for an average of 5,360 e-mail messages per user. Each Exchange Server also contained 5,000 unique e-mails (5 mailboxes with 1,000 messages each), generated using the Mailsend.exe tool and containing a 50/50 split of embedded text and attached text with a unique string used for retrieval by the Symantec Enterprise Vault Discovery Accelerator application. We stored these e-mails on the FAS3050 and FAS3020 Fibre Channel connected arrays mentioned earlier.

For the archive storage hardware, we used an authorized service partner (Unisys) and the publicly available best practices for the installation and configuration of the Centera G4. We also used the publicly available best practices documentation from Network Appliance to provision the FAS3050 and the R200. We archived approximately 1.7 million e-mails from a population of 80.4 million for each execution of the Enterprise Vault archiving tasks during the 100% deployment testing. The archival volume was half this value when running the 50% deployment tests.
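The mailbox arithmetic above can be sanity-checked directly; a minimal sketch using only figures quoted in the report:

```python
# LoadSim population: 80.4 million e-mails spread over 15,000 mailboxes.
total_emails = 80_400_000
mailboxes = 15_000
per_mailbox = total_emails // mailboxes
print(per_mailbox)  # 5360 e-mails per user, as stated above

# Unique Mailsend.exe e-mails: 5 mailboxes x 1,000 messages per Exchange
# server, across the eight servers of the 100% deployment.
unique_per_server = 5 * 1_000
print(unique_per_server * 8)  # 40000 retrievable by Discovery Accelerator
```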
Overall we found that a single-node FAS3050 with twenty-four 250GB 7200RPM drives scaled performance at a rate 42% greater than a 16-node Centera G4 with forty-eight 500GB 7200RPM drives, even as we continued to fill the file systems on the FAS3050 past 50% full. Moving from the 50% deployment up to the 100% deployment, the FAS3050 archival rate increased from 157MB/minute to 261MB/minute, a 104MB/minute improvement. The Centera G4 moved from 116MB/minute to 189MB/minute, an improvement of only 73MB/minute.
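The scalability claim reduces to comparing rate deltas; a quick sketch with the reported MB/minute values reproduces the 42% figure:

```python
def scaling_gain(rate_50_pct, rate_100_pct):
    """Absolute MB/minute gained moving from the 50% to the 100% deployment."""
    return rate_100_pct - rate_50_pct

fas3050_gain = scaling_gain(157, 261)  # 104 MB/minute
centera_gain = scaling_gain(116, 189)  # 73 MB/minute

# How much larger the FAS3050's gain is, relative to the Centera G4's.
advantage = (fas3050_gain - centera_gain) / centera_gain * 100
print(f"{advantage:.0f}%")  # 42%
```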

Page 3: NetApp FAS3050-R200 EMC Centera Report


[Figure: Enterprise Vault Archival Scalability (MB/minute), 50% vs. 100% deployment, for the FAS3050 and Centera. The FAS3050 gained 104 MB/minute; the Centera gained 73 MB/minute.]

Please refer to “Testing Methodology,” later in this report, for complete details of how we conducted the performance testing on the FAS3050, R200, and Centera G4.

Page 4: NetApp FAS3050-R200 EMC Centera Report


Enterprise Vault Archival Throughput Results 100% Deployment: Eight Enterprise Vault Servers with Eight Exchange Servers The 100% deployment test consisted of eight Enterprise Vault servers to archive e-mail located on eight Exchange servers. For a detailed view of the test environment, see “Testing Methodology,” later in this report.

Establish Baseline Enterprise Vault Archival Throughput To establish a baseline for archival throughput, we enabled 35 LoadSim-generated mailboxes and the 5 Mailsend.exe-generated mailboxes in Enterprise Vault for each corresponding Exchange Server. We monitored the archival throughput by capturing the archival rate as reported in the saveset SQL table. We monitored this rate for 60 minutes to determine the average baseline archival throughput. Monitoring the archival process over the course of one hour, we found that the Network Appliance R200 (single node) achieved the highest archival performance rate, averaging 275MB/minute ingestion of content storage—45% faster than the Centera G4 and 5% faster than the FAS3050. The FAS3050 (single node) averaged 261MB/minute, while the Centera G4 ingested content at a rate of 189MB/minute, putting the FAS3050 ingestion rate at a level 38% greater than that of the Centera G4. Moving forward, we considered this the baseline archival rate for each platform.
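The relative-rate percentages quoted throughout this report follow one formula; a minimal sketch with the 100% deployment baseline values (truncation, rather than rounding, matches the report's quoted figures):

```python
def pct_faster(rate_a, rate_b):
    """Percentage (truncated) by which rate_a exceeds rate_b."""
    return int((rate_a - rate_b) / rate_b * 100)

r200, fas3050, centera = 275, 261, 189  # baseline MB/minute
print(pct_faster(r200, centera))     # 45
print(pct_faster(fas3050, centera))  # 38
print(pct_faster(r200, fas3050))     # 5
```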

[Figure: Baseline Enterprise Vault Archival Rates, 100% Deployment (MB/minute), for the R200, FAS3050, and Centera.]

Page 5: NetApp FAS3050-R200 EMC Centera Report


RAID vs. RAIN Both EMC and Network Appliance provide storage resiliency features; however, there are significant differences in the individual drive replacement procedures between the Network Appliance FAS3050/R200 and the EMC Centera G4. The Network Appliance FAS3050/R200 products support the removal of a single failing disk and the replacement of the failing disk on a one-to-one basis without bringing the disk enclosure offline, as would be necessary in standard storage arrays. In order to test a drive failure scenario on the Centera, we had to work within its unique RAIN architecture. (http://www.emc.com/products/systems/centera/pdf/C938.6_Centera_PDG_ldv.pdf). With the EMC implementation of a RAIN architecture, the storage node containing a single failed disk must be brought offline prior to replacing the failed disk. Each Centera storage node contains four 500GB SATA drives. In consultation with EMC technical resources, we received the following instructions for conducting a drive failure/pull test:

• If a single drive in a storage node is pulled, it cannot immediately be returned into the Centera. The drive must be returned to EMC for repair and a spare drive obtained.

• Upon receipt of the replacement drive, the procedure for the single-drive replacement requires:

o Shut down the storage node
o Replace the drive
o Restart the storage node

In order to simulate a drive failure given this constraint, all drive failure test data used in this report were obtained by powering off a Centera storage node from the front panel and unplugging the dual power supplies. The second storage node was powered off in the same manner. Powering the storage nodes back up brought them back online.

Page 6: NetApp FAS3050-R200 EMC Centera Report


Single-Disk Failure In the SATA market, it is not unreasonable to expect drive failures in 24/7/365 archival environments. In order to test a drive failure condition, we physically pulled a single disk drive from the drive shelf enclosures on the FAS3050 and on the R200. Given the RAIN architecture of the Centera and the drive replacement procedure prescribed by EMC (and summarized in the previous section, “RAID vs. RAIN”), we simply powered off a single storage node (vs. an access node). To determine the performance impact on each system, we measured archival rates using EV Archival Throughput as reported on the Enterprise Vault saveset SQL table. A single-drive failure on the Network Appliance systems resulted in little or no performance degradation. The R200 decreased 6.5% during this drive failure test. The FAS3050 actually showed a slight statistical improvement in throughput capacity (up 1.5%), albeit at a higher utilization rate. A storage node failure on the Centera G4 resulted in a 22% drop in performance. In an archival test while experiencing a single-drive failure, the FAS3050 had an ingestion rate 80% greater than the Centera G4 (265MB/minute vs. 147MB/minute), while the R200 showed a 74% advantage over the Centera G4.

[Figure: Enterprise Vault Archival Ingestion Rates, Single Drive Failure Test (MB/minute): baseline vs. single-drive failure for the R200, FAS3050, and Centera.]

Double-Disk Failure Today, the likelihood of experiencing a media/bit error during reconstruction of an eight-drive RAID 4/5 set with 500GB SATA drives has reached 28% (see Appendix C, “Application Data Integrity with RAID-DP,” for supporting information). With that in mind, we constructed a dual-drive failure scenario. To determine performance impact to each system, we again measured archival rates using EV Archival Throughput as reported on the Enterprise Vault saveset SQL table. In a two-drive failure scenario, the FAS3050 again maintained an archival rate within 5% of its baseline. However, the R200 dropped 36% from its baseline archival rate, while the Centera fell 41% from its baseline. Relative to each other under a two-drive failure scenario, the FAS3050 archived 124% (2.24 times) faster than the Centera G4. The R200 experienced an archival rate that was 56.36% faster than the Centera G4. Again, it’s important to note that the FAS3050 used half the data drives of the Centera G4.
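The 28% reconstruction-error likelihood is a standard unrecoverable-read-error calculation. A hedged sketch under assumed inputs: a 1-in-10^14-bit unrecoverable-error rate is a common SATA spec-sheet value, but the report does not state its exact parameters, so this lands near, rather than exactly at, 28%:

```python
import math

def rebuild_ure_probability(drives_read, drive_bytes, per_bit_rate=1e-14):
    """Probability of at least one unrecoverable read error while
    reading drives_read full drives during a RAID reconstruction."""
    bits = drives_read * drive_bytes * 8
    # 1 - (1 - p)^bits, computed in a numerically stable way.
    return -math.expm1(bits * math.log1p(-per_bit_rate))

# Rebuilding an eight-drive RAID 4/5 set of 500GB drives means reading
# the seven surviving drives end to end.
print(f"{rebuild_ure_probability(7, 500e9):.0%}")  # ~24% with these inputs
```

With a slightly higher assumed error rate, or with the parity re-read included, the probability rises toward the 28% cited above.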

Page 7: NetApp FAS3050-R200 EMC Centera Report


[Figure: Enterprise Vault Archival Ingestion Rates, Dual Drive Failure Test (MB/minute): baseline, single-drive failure, and dual-drive failure for the R200, FAS3050, and Centera.]

Enterprise Vault Archival Throughput During a Discovery Accelerator Retrieval Task Upon completion of the archival test for the 35 LoadSim-generated and 5 Mailsend.exe-generated e-mails, we enabled 40 additional LoadSim-generated mailboxes per EV server and initiated a second archival run—a net new archival test. While these Enterprise Vault tasks were running, we launched a concurrent Enterprise Vault Discovery Accelerator process to retrieve the 40,000 unique e-mails and place them in a PST file. We again measured the performance impact on archival throughput using the EV Archival Throughput tool as reported on the EV saveset SQL table. We did not fail drives during this test; the single-drive failure results are retained in the figure only for reference, as all tests were mutually exclusive. In this particular test, the goal was to measure the impact of the discovery process on archival performance.

[Figure: Enterprise Vault Archival Ingestion Rates During Discovery (MB/minute): baseline, single-drive failure, dual-drive failure, and discovery for the R200, FAS3050, and Centera; a second axis shows retrieval task time (HH:MM), 0:00 through 12:00.]

Page 8: NetApp FAS3050-R200 EMC Centera Report


During the Discovery Accelerator search and retrieval of the 40,000 unique e-mails, the archival rate for the FAS3050 was unaffected. We did see that the R200 was down 13% from its baseline archival performance. The Centera G4 fell 15% from its baseline. Relative to each other, we found that during the discovery process, the FAS3050 ingested content at a rate 62.5% greater than that of the Centera G4. On average, the R200 archival throughput rate was 48.7% faster than the Centera. The discovery process itself ran well across all systems, completing in 1 hour 30 minutes on the Centera; 1 hour 55 minutes on the FAS3050; and 2 hours 20 minutes on the R200.

Rebuild EV Indexes During Archival Throughput To measure the impact on archival throughput during an index rebuild, we failed one index on the Enterprise Vault server by running the “indexvolumereplay” utility. The performance impact was measured using the Enterprise Vault Archival Throughput as reported on the EV saveset SQL table. Once again, we did not fail drives during this test. The goal was to measure the impact of an index rebuild on archival performance. During this test, we observed the largest impact to the FAS3050 in any test to this point; its performance dropped 22% from its baseline archival rate. Likewise, the R200 archival rate dropped to its lowest level, off 52% from its baseline archival rate. The R200 showed an archival rate 18% below that of the Centera G4 during this index rebuild test. The Centera G4 dropped 17% from its baseline. Even so, the FAS3050 maintained a 30% performance advantage over the Centera G4. Another way to look at this is that the FAS3050, during an index rebuild, statistically outperformed the Centera G4 baseline. The following figure summarizes all measured archival rates, including the index rebuild test.

Enterprise Vault Archival Ingestion Rates, 100% Deployment (MB/minute):

Test                   R200   FAS3050   Centera
Baseline                275       261       189
Single Drive Failure    257       265       147
Dual Drive Failure      176       251       112
Discovery               237       260       160
Index Rebuild           132       203       156

Page 9: NetApp FAS3050-R200 EMC Centera Report


50% Deployment: Four Enterprise Vault Servers with Four Exchange Servers For a more moderate test, we cut the test environment in half, using four Microsoft Exchange 2003 servers and four Enterprise Vault servers. For this series of tests, we compare the results for the FAS3050 with the results for the Centera G4. Establish Baseline Enterprise Vault Archival Throughput To establish a baseline for archival throughput, we enabled 35 LoadSim-generated mailboxes and 5 Mailsend.exe-generated unique mailboxes in Enterprise Vault for each corresponding Exchange Server. We monitored the archival throughput by capturing the archival rate as reported in the saveset SQL table. We monitored this rate for 60 minutes to determine the average baseline archival throughput. Monitoring the archival process over the course of one hour, we found that the FAS3050 averaged 157 MB/minute, while the Centera G4 ingested content at a rate of 116MB/minute, putting the FAS3050 ingestion rate at a level 35% greater than that of the Centera G4. Moving forward, we considered this the baseline archival rate for each platform.

[Figure: Baseline Enterprise Vault Archival Rates, 50% Deployment (MB/minute), for the FAS3050 and Centera.]

Single-Disk Failure In order to test a drive failure condition, we physically pulled a single-disk drive from the drive shelf enclosures on the FAS3050. We powered off a single storage node (vs. an access node) on the Centera G4. To determine the performance impact on each system, we measured archival rates using EV Archival Throughput as reported on the Enterprise Vault saveset SQL table. As in the 100% deployment scenario, a single-drive failure on the FAS3050 showed no impact on its overall archival performance. A storage node failure on the Centera showed a subsequent drop in archival performance of 11%. The FAS3050 ingested content at a rate 54% greater than the Centera G4.

Page 10: NetApp FAS3050-R200 EMC Centera Report


[Figure: Enterprise Vault Archival Ingestion Rates, Single Drive Failure Test, 50% Deployment (MB/minute): baseline vs. single-drive failure for the FAS3050 and Centera.]

Double-Disk Failure We then conducted a dual-drive failure test for the 50% deployment test environment. To determine performance impact to each system, we again measured archival rates using EV Archival Throughput as reported on the Enterprise Vault saveset SQL table. In a two-drive failure scenario, the FAS3050 maintained an archival rate within 5% of its baseline. With the failure of two storage nodes, the Centera experienced a 12% decrease in performance relative to its baseline. Relative to each other, the FAS3050 archived e-mail 48% faster than the Centera G4.

[Figure: Enterprise Vault Archival Ingestion Rates, Dual Drive Failure Test, 50% Deployment (MB/minute): baseline, single-drive failure, and dual-drive failure for the FAS3050 and Centera.]

Page 11: NetApp FAS3050-R200 EMC Centera Report


Enterprise Vault Archival Throughput During Discovery Upon completion of the archival test for the 35 LoadSim-generated and 5 Mailsend.exe-generated emails, we enabled 40 additional LoadSim-generated mailboxes per EV server and initiated a second archival run. While these Enterprise Vault tasks were running, we launched a concurrent Enterprise Vault Discovery Accelerator process to retrieve the 20,000 unique e-mails and place them in a PST file. We again measured the performance impact to archival throughput using the EV Archival Throughput tool as reported on the EV saveset SQL table. We did not fail drives during this test. The goal was to measure the impact of the discovery process on archival performance.

[Figure: Enterprise Vault Archival Ingestion Rates During Discovery, 50% Deployment (MB/minute): baseline, single-drive failure, dual-drive failure, and discovery for the FAS3050 and Centera; a second axis shows retrieval task time (HH:MM), 0:00 through 12:00.]

During the Discovery Accelerator search and retrieval of the 20,000 unique e-mails, the archival rate for the FAS3050 fell 7% from its baseline rate, dropping from 157MB/minute to 145MB/minute. The Centera G4 fell 20% from its baseline. Relative to each other, we found that during the discovery process the FAS3050 ingested content at a rate 58% greater than that of the Centera G4. The discovery process itself ran well for both systems, completing in 57 minutes on the FAS3050 and 1 hour 5 minutes on the Centera.

Rebuild EV Indexes During Archival Throughput To measure the impact on archival throughput during an index rebuild, we failed one index on the Enterprise Vault server by running the “indexvolumereplay” utility. The performance impact was measured using the Enterprise Vault Archival Throughput tool as reported on the EV saveset SQL table. Once again, we did not fail drives during this test. The goal was to measure the impact of an index rebuild on archival performance. The index rebuild dropped the content archival rate 24%, from 157MB/minute to 118MB/minute. Once again, by way of comparison, the FAS3050 during an index rebuild statistically outperformed the Centera G4 baseline. The Centera G4 archival performance dropped 14% during the index rebuild. Relative to the FAS3050 performance during its index rebuild, the Centera G4 was 18% slower. The following figure summarizes all measured archival rates in each test.

Page 12: NetApp FAS3050-R200 EMC Centera Report


Enterprise Vault Archival Ingestion Rates, 50% Deployment (MB/minute):

Test                   FAS3050   Centera
Baseline                   157       116
Single Drive Failure       157       103
Dual Drive Failure         149       101
Discovery                  145        92
Index Rebuild              118       100

Page 13: NetApp FAS3050-R200 EMC Centera Report


Testing methodology

For testing purposes, we used the following assumptions and facts:

o Enterprise Vault Journaling was turned off for all testing.
o All tests were performed in a serial fashion for each storage product.
o Each storage platform had dedicated access to the test environment during its test runs.
o For our test scenarios, we used a LoadSim-generated workload consisting of 80.4 million e-mails using the default LoadSim message mixture.
o We stored Microsoft Exchange and Microsoft SQL Server data on Fibre Channel attached FAS3050 and FAS3020 storage appliances.
o During the 100% deployment test, eight Enterprise Vault servers archived e-mail messages from eight Microsoft Exchange servers to the storage targets.
o 100% load for all storage targets was considered to be 560,000 e-mails per hour.
o 50% load for all storage targets was considered to be 280,000 e-mails per hour.

For testing purposes, we created a typical medium-sized enterprise Microsoft Exchange e-mail environment with 80+ million e-mails, spread evenly across 15,000 mailboxes, and archived a percentage of those e-mails to three types of NAS storage solutions, using the Symantec Enterprise Vault application. We then measured the archival throughput performance of Enterprise Vault for each of the NAS storage devices while the archival tasks were running. Each test was conducted independently.

We generated two deployment levels, based upon differing numbers of available Enterprise Vault and Exchange servers. The first deployment level consisted of four Enterprise Vault servers and four Microsoft Exchange 2003 servers configured in a one-to-one relationship. This deployment is referred to as the 50% deployment level. The second deployment level consisted of eight Enterprise Vault servers and eight Microsoft Exchange 2003 servers, also configured in a one-to-one relationship. This deployment is referred to as the 100% deployment level.

For the Microsoft Exchange servers, 15,000 users were generated using the LoadSim 2003 application to populate the Exchange data evenly across the available Exchange servers. We then created five unique mailboxes per Exchange server, and used the mailsend.exe utility to generate 1,000 unique e-mails for each of those accounts. Therefore, for the 50% testing there were 20,000 unique e-mails available for Discovery Accelerator search and retrieval, and 40,000 unique e-mails available at the 100% level. We started by determining the baseline throughput when only archival tasks were executing.
We then introduced scenarios that one would expect to encounter in a real-world deployment, and measured the Enterprise Vault archival throughput during those situations. Each deployment was configured with a single Windows 2003 Active Directory server to manage the AD domain and a single SQL Server 2000 to contain the database information for the Enterprise Vault application. Further, we broke the Enterprise Vault archival tasks into two rounds of execution. The first round archived e-mail for 35 LoadSim-generated mailboxes per Exchange Server, along with the 5 unique mailboxes on each Exchange Server. For this first round of archiving tasks, we gave Enterprise Vault 45 minutes to ramp up. We then measured for 60 minutes to determine the baseline archival throughput. To simulate a single-disk failure, we removed a single-parity disk from the active aggregate on the NetApp NAS devices, and powered off a single storage node on the Centera G4. We then measured the Enterprise Vault archival throughput for 60 minutes to monitor for any performance change. We then simulated a double-disk failure by removing a second disk in the aggregate on the NetApp NAS devices, and by powering off a second storage node on the Centera G4. Again, we measured the archival throughput to monitor for any change in performance. We then reinserted the disks on the NetApp NAS devices, powered on the downed nodes on the Centera G4, and waited for the archival tasks to finish before moving on to the next scenario. A second Enterprise Vault archiving task was executed using 40 newly enabled LoadSim-generated mailboxes for each Exchange server. We again allowed Enterprise Vault 45 minutes to ramp up, and monitored for 60 minutes to verify the baseline archival throughput rate. We then executed a Discovery Accelerator search and retrieval to retrieve the unique e-mails previously archived, while the current archiving

Page 14: NetApp FAS3050-R200 EMC Centera Report


tasks were running. We then monitored for any change in archival throughput performance during the retrieval process. We also measured the length of time it took Discovery Accelerator to retrieve the unique messages. Once the DA retrieval was complete, we used the Symantec IndexVolumeReplay utility to rebuild a previously archived mailbox index for each Enterprise Vault server. Again, we monitored for any change in archival throughput. Server performance impact was measured through Perfmon, which we used to monitor the relative health and performance of the Enterprise Vault, Exchange, SQL, and domain controller servers. During the Centera G4 testing, the Centera Viewer utility was used to monitor the performance of the G4. During the FAS3050 and R200 testing, the native Data ONTAP® command sysstat was used to monitor the performance of the FAS3050/R200 devices. To monitor the performance of the Brocade 3800 Silkworm FCP switch, we ran the portperfshow command continuously to watch, in real time, the throughput for each Exchange server and verify that there were no performance bottlenecks coming from the Exchange storage. We used the NetGear administration GUI to monitor the Gigabit Ethernet switch to make sure that no errors or performance bottlenecks were experienced on that device as well. The following code shows the SQL query used to retrieve the EV archival throughput rate.

select "Archived Date" = cast(archiveddate as smalldatetime),
       "Hourly Rate" = count(*) * 60,
       "Av Size" = sum(itemsize) / count(*)
from saveset
group by cast(archiveddate as smalldatetime)
order by "Archived Date" desc
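The query works because casting archiveddate to smalldatetime reduces it to minute precision, so each group is one minute of savesets and count(*)*60 projects that count to an hourly rate. A rough Python equivalent of the same aggregation, using hypothetical saveset rows:

```python
from collections import defaultdict
from datetime import datetime

def archival_rates(savesets):
    """Group (archiveddate, itemsize) rows by minute; return
    {minute: (hourly_rate, avg_item_size)} as the SQL query does."""
    buckets = defaultdict(list)
    for archived, size in savesets:
        buckets[archived.replace(second=0, microsecond=0)].append(size)
    return {minute: (len(sizes) * 60, sum(sizes) // len(sizes))
            for minute, sizes in buckets.items()}

# Two hypothetical items archived within the same minute.
rows = [(datetime(2006, 7, 1, 10, 0, 5), 40_000),
        (datetime(2006, 7, 1, 10, 0, 30), 60_000)]
print(archival_rates(rows))  # hourly rate 120, average size 50000
```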

Page 15: NetApp FAS3050-R200 EMC Centera Report


[Test bed diagram: archive storage platforms under test]
• Network Appliance FAS3050: 24 x 250GB SATA drives, 4 GbE Ethernet
• Network Appliance R200: 48 x 320GB SATA drives, 4 GbE Ethernet
• EMC Centera G4: 4 access / 12 storage nodes, 48 x 500GB SATA drives, 4 GbE Ethernet


Detailed Test Execution Performance Test Case #1: Baseline Throughput, 100% Deployment

• Enterprise Vault was configured to enable 35 LoadSim-generated mailboxes per EV server, as well as 5 unique mailboxes per EV server.

• Microsoft Performance Monitor was executed on each of the EV, Exchange, and SQL servers.
• The Data ONTAP sysstat command was executed on the Exchange storage heads.
• The portperfshow command was executed on the Brocade 3800 FCP switch.
• The NetGear GUI was launched to monitor the Gigabit Ethernet switch.
• We logged into each EV, Exchange, SQL, and AD server via the Admin interface to execute Task Manager for real-time performance monitoring and Event Viewer monitoring.
• During Centera G4 testing, the Centera Viewer and Centera Monitor GUIs were launched over the crossover connection from EV Server 2.
• During FAS3050 and R200 testing, the sysstat command was executed on each storage head to monitor its health.
• Once all monitors were up, the EV Archiving Tasks were executed.
• During execution, our test engineer monitored the MSMQ private message queues for EV to verify that the messages were being processed.
• The test engineer ran the "archrate.sql" query on each of the EV databases on the SQL server to capture the EV archival throughput rate for each EV server. This data was then aggregated to show a cumulative throughput rate.
• For the first 45 minutes, the archiving rate was not recorded, to allow EV to ramp up.
• After 45 minutes, the next 60 minutes of "archrate.sql" output was captured for each server and recorded in an Excel spreadsheet.
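The per-server aggregation step above can be sketched as follows. The server names and per-minute rates are hypothetical; in the tests, the rates came from each server's "archrate.sql" output.

```python
# Hypothetical per-minute "Hourly Rate" samples from archrate.sql,
# keyed by EV server name (names are illustrative only).
per_server_rates = {
    "ev1": [5400, 5520, 5460],
    "ev2": [5280, 5340, 5400],
}

# Cumulative throughput at each sample point is the sum across servers.
cumulative = [sum(sample) for sample in zip(*per_server_rates.values())]
print(cumulative)
```

In the tests this summation was done in an Excel spreadsheet rather than in code, but the arithmetic is the same.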

Performance Test Case #2: Single-Disk Failure, 100% Deployment

• After the baseline rate was determined, the test engineer initiated a single-disk failure scenario.
• For the Centera G4 testing, this was accomplished by powering off the top-most storage node in the array.
• For the FAS3050 and R200 devices, a single disk (parity) was pulled from the active aggregate in the array.
• EV archival throughput rate was captured for 60 minutes using the "archrate.sql" query.

Performance Test Case #3: Double-Disk Failure, 100% Deployment

• After test case #2 was complete, the test engineer introduced a double-disk failure.
• For the Centera G4 testing, this was accomplished by powering off a second storage node (the second storage node from the top).
• For the FAS3050 and R200, this was accomplished by removing a second disk (data) from the active aggregate in the array.
• EV archival throughput rate was captured for 60 minutes using the "archrate.sql" query.
• EV tasks were allowed to finish.
• Disk reconstruction and node on-lining were allowed to finish for each type of storage device.

Performance Test Case #4: DA Search and Retrieval, 100% Deployment

• Enterprise Vault was configured to enable a different set of 40 LoadSim-generated mailboxes per EV server.

• Microsoft Performance Monitor was executed on each of the EV, Exchange, and SQL servers.
• The Data ONTAP sysstat command was executed on the Exchange storage heads.
• The portperfshow command was executed on the Brocade 3800 FCP switch.
• The NetGear GUI was launched to monitor the Gigabit Ethernet switch.
• We logged into each EV, Exchange, SQL, and AD server via the Admin interface to execute Task Manager for real-time performance monitoring and Event Viewer monitoring.


• During Centera G4 testing, the Centera Viewer and Centera Monitor GUIs were launched over the crossover connection from EV Server 2.
• During FAS3050 and R200 testing, the sysstat command was executed on each storage head to monitor its health.
• Once all monitors were up, the EV Archiving Tasks were executed.
• During execution, the test engineer monitored the MSMQ private message queues for EV to verify that the messages were being processed.
• The test engineer ran the "archrate.sql" query on each of the EV databases on the SQL server to capture the EV archival throughput rate for each EV server. This data was then aggregated to show a cumulative throughput rate.
• For the first 45 minutes, the archiving rate was not recorded, to allow EV to ramp up.
• After 45 minutes, the next 60 minutes of "archrate.sql" output was captured for each server and recorded in an Excel spreadsheet.

• The test engineer initiated a Discovery Accelerator search and retrieval to retrieve the unique e-mails to one or more PST files on one of the EV servers. We used EV Server 4 (dell2) to perform this operation.

• While the DA retrieval was occurring, the test engineer captured the EV archival throughput rate using the “archrate.sql” query. This data was captured until the DA retrieval stage completed. We did not monitor the time taken to merge the MSG files to the PST file because we were not interested in measuring the local disk performance of the EV server.

• The test engineer also recorded the length of time it took the DA retrieval portion to complete.

Performance Test Case #5: EV Index Rebuild

• While the EV archiving tasks were still running, the IndexVolumeReplay tool was used to generate an index rebuild for one index per EV server.

• While the index rebuild was occurring along with the archival tasks, the test engineer used the “archrate.sql” query to capture the EV archival throughput rate for 60 minutes.


Appendix A: Test Bed Disclosures

• The maxdirsize value was modified on the FAS3050 from the default of 1% of RAM in KB to 1.4% to allow all e-mail from each round of EV archiving tasks to complete. Pre-testing determined that this value did not affect the FAS3050 performance either positively or negatively.
• Each server in the test bed had an administration interface configured on a separate subnet, which was used to remotely log in via RDP to monitor the health of the servers and to execute necessary applications.
• For EV index storage, the FAS3050 used for Exchange storage (fas3050-01) had a volume created and shared out via FCP to the SQL Server. The SQL Server used Windows sharing to share out the LUN to each of the EV servers via the Gigabit Ethernet data network.


Appendix B: System Disclosures

Network Appliance FAS3020-01 - Exchange Storage Solution (SATA)
  Storage Unit:          Network Appliance FAS3020
  Disk Drives:           23 x 7200 RPM SATA 150 (with 2 spares)
  Base Software Version: Network Appliance Data ONTAP Release 7.1

Table 1. Network Appliance FAS3020-01 Disclosure Information

Network Appliance FAS3050-01 - Exchange Storage Solution (FC)
  Storage Unit:          Network Appliance FAS3050
  Disk Drives:           Aggregate 1 - 77 x 10K RPM SCSI (with 3 spares)
                         Aggregate 2 - 13 x 15K RPM SCSI (with 1 spare)
  Base Software Version: Network Appliance Data ONTAP Release 7.1

Table 2. Network Appliance FAS3050-01 Disclosure Information

Network Appliance FAS3050-02 - Enterprise Vault Storage (250GB Drives)
  Storage Unit:          Network Appliance FAS3050
  Disk Drives:           Aggregate 1 - 24 x 7200 RPM SATA 150 (with 2 spares)
  Base Software Version: Network Appliance Data ONTAP Release 7.1

Table 3. Network Appliance FAS3050-02 Disclosure Information

Network Appliance R200 - Enterprise Vault Storage (320GB Drives)
  Storage Unit:          Network Appliance R200
  Disk Drives:           Aggregate 1 - 48 x 7200 RPM SATA 150 (with 2 spares)
  Base Software Version: Network Appliance Data ONTAP Release 7.1

Table 4. Network Appliance R200 Disclosure Information

EMC Centera G4 Storage
  Storage Unit:          Four access nodes and twelve storage nodes;
                         two Allied Telesyn AT9924T managed Gigabit switches
  Access Nodes:          Four - 2.8GHz Intel Prescott processors / 1GB RAM
  Storage Nodes:         Twelve - 2.8GHz Intel Prescott processors / 1GB RAM
  Base Software Version: 3.1.0-956-654-12195

Table 5. EMC Centera G4 Disclosure Information

Dell PowerEdge 2650 - Exchange Server (2), SQL Server (1), Domain Controller/Active Directory Server (1), Enterprise Vault (1)
  Processor / Speed / # of CPUs: Dual Intel 2.8GHz Xeon, 533MHz
  System RAM / Type:             4GB (PC600 RDRAM)
  Network Adapter:               2 x Intel 82544 PRO/1000 XT Gigabit Ethernet Controller
  OS:                            Microsoft Windows 2003 Enterprise Edition
  Installed HBAs:                2 x QLogic 2340, version 9.0.1.12 (SCSI miniport), driver date 10/10/2004

Table 6. Dell PowerEdge 2650 Servers Used for Exchange, SQL, and Domain Controller/Active Directory


Dell PowerEdge 2850 - Enterprise Vault (2)
  Processor / Speed / # of CPUs: Dual Intel 3.0GHz Xeon, 800MHz
  System RAM / Type:             4GB (PC600 RDRAM)
  Network Adapter:               2 x Intel 82544 PRO/1000 XT Gigabit Ethernet Controller
  OS:                            Microsoft Windows 2003 Enterprise Edition
  Installed HBAs:                2 x QLogic 2340, version 9.0.1.12 (SCSI miniport), driver date 10/10/2004

Table 7. Dell PowerEdge 2850 Servers Used for Enterprise Vault

Hewlett Packard ProLiant DL360 G3 - Enterprise Vault Server (2)
  Processor / Speed / # of CPUs: Dual Intel 2.8GHz Xeon, 400MHz
  System RAM / Type:             4GB PC2100 DDR
  Network Adapter:               Dual HP NC7781 PCI-X Gigabit server adapters (embedded)
  OS:                            Microsoft Windows 2003 Enterprise Edition

Table 8. HP ProLiant DL360 Servers Used for Enterprise Vault

Hewlett Packard ProLiant DL380 G3 - Enterprise Vault Server (1)
  Processor / Speed / # of CPUs: Dual Intel 2.8GHz Xeon, 400MHz
  System RAM / Type:             4GB PC2100 DDR
  Network Adapter:               Dual HP NC7781 PCI-X Gigabit server adapters (embedded)
  OS:                            Microsoft Windows 2003 Enterprise Edition

Table 9. HP ProLiant DL380 G3 Servers Used for Enterprise Vault

Hewlett Packard ProLiant DL380 G4 - Exchange Server (2), Enterprise Vault Server (2)
  Processor / Speed / # of CPUs:  Dual Intel 3.2GHz Xeon, 800MHz
  System RAM / Type / # of Slots: 4GB (PC600 RDRAM)
  Network Adapter:                Dual-port NC7782 Gigabit NIC
  OS:                             Microsoft Windows 2003 Enterprise Edition
  Installed HBAs:                 2 x QLogic 2340, version 9.0.1.12 (SCSI miniport), driver date 10/10/2004

Table 10. Hewlett Packard DL380 G4 Servers Used for Exchange and Enterprise Vault

Networking Equipment
  Fibre Channel Switch:    Brocade SilkWorm 3800, firmware v3.1.3
  Gigabit Ethernet Switch: NetGear GS748T Gig-E non-blocking managed switch

Table 11. Networking Equipment Used


Appendix C: Application Data Integrity with RAID-DP™

The dramatic increase in disk sizes, the relatively slower growth in disk bandwidth, the construction of disk arrays containing larger numbers of disks, and the use of less reliable and poorer performing varieties of disk such as ATA combine to increase the rate of double-disk failures. Dual-parity algorithms such as RAID-DP protect against double-disk failures and ensure adequate data integrity. For a complete discussion of algorithms that protect against two-disk failures, refer to the Usenix publication "Row-Diagonal Parity for Double Disk Failure Correction" (Peter Corbett, Bob English, Atul Goel, Tomislav Grcanac, Steven Kleiman, James Leong, and Sunitha Sankar), http://www.usenix.org/publications/library/proceedings/fast04/tech/corbett/corbett.pdf.

System Reliability Event                                           FC(1)     SATA(2)
-----------------------------------------------------------------  --------  --------
Typical Disk Drive Replacements/Year (per year per 100 drives)     1 – 3     2 – 5
Bit Error Likelihood (per spindle, based on typical drive specs)   0.2%      4.0%
Bit Error Likelihood – Single Parity
  (per reconstruction of an 8-drive RAID 4 or 5 set)               1.4%      28.0%
Bit Error Likelihood – Dual Parity
  (per reconstruction of a 16-drive RAID-DP set)                   < 1 in 30 million

(1) Bit error rates for a 10K RPM 300GB FC drive
(2) Bit error rates for a 7200 RPM 500GB SATA drive

• RAID-DP is needed to ensure primary application data availability.
• SATA drive failure rates are estimated at up to double FC over a three- to five-year service life.
• Drive failures result in RAID reconstructions, and SATA reconstructions occur more often.
• An unrecoverable drive error during a RAID reconstruction results in data loss.
• Four reconstructions per year = ~100% chance of data loss with 500GB SATA and RAID 5.

• Disk drive replacement: The failure of a disk drive in a single-parity RAID group presents a substantial risk of data loss during the RAID rebuild/reconstruction process. Some disk replacements result from transient drive, array, or loop problems and thus are not hard failures, but any event that causes a RAID rebuild creates a risk. How often drives are replaced is typically a function of many factors, including physical environment, I/O workload, drive age, drive production batch, drive design generation, etc. Experience to date shows that SATA drives can be every bit as reliable as FC drives when they are used for their appropriate workloads, but there is some indication that SATA drive failure rates may increase more than FC rates as the drives age. More important than debating how often one or the other drive type fails is the understanding that drive failures will occur, presenting a risk of data loss without double-parity protection.

• Bit error likelihood: Typical non-recoverable (or uncorrectable) error rate (UER) specs are 1 error in 10^15 bits read for FC drives and 1 in 10^14 for SATA drives.

Note:
• The specs assume that full data recovery procedures are available, but these entail more read retries than RAID array system timeout limits allow, which may result in the spec error rate not being achieved (that is, a higher incidence of errors may result).
• The 0.2% rate applies to 300GB FC drives; it is the number of bits per drive divided by the UER spec of 1 in 10^15. The same calculation for a 500GB SATA drive, with a UER spec of 1 in 10^14, results in a 4% rate.
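The per-spindle and single-parity figures above follow directly from the UER specs. The sketch below treats 1GB as 10^9 bytes, which reproduces the report's rounded numbers:

```python
# Per-spindle bit error likelihood = bits per drive / UER denominator.
FC_BITS = 300e9 * 8      # 300GB FC drive, in bits
SATA_BITS = 500e9 * 8    # 500GB SATA drive, in bits

fc_per_spindle = FC_BITS / 1e15      # UER: 1 error in 10^15 bits (FC)
sata_per_spindle = SATA_BITS / 1e14  # UER: 1 error in 10^14 bits (SATA)

# Single-parity likelihood: all 7 surviving drives of a 7+1 RAID 4/5
# group must be read in full during reconstruction.
fc_single_parity = 0.002 * 7         # using the report's rounded 0.2%
sata_single_parity = sata_per_spindle * 7

print(f"FC per spindle:     {fc_per_spindle:.2%}")      # 0.24%, reported as 0.2%
print(f"SATA per spindle:   {sata_per_spindle:.2%}")    # 4.00%
print(f"FC single parity:   {fc_single_parity:.1%}")    # 1.4%
print(f"SATA single parity: {sata_single_parity:.1%}")  # 28.0%
```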


• Single-parity RAID bit error likelihood: The product of the individual drive error likelihood times the seven drives being read during rebuild of a 7+1 single-parity RAID group.
• Double-parity RAID bit error likelihood: A more complex calculation that incorporates the probability of a second drive failing during RAID reconstruction of another drive in the same 16-drive RAID-DP group and a bit error occurring while both drives are in rebuild.
  • The probability of two disks failing simultaneously is the rate of single-disk failures per year squared, times the reconstruction time in years, times the number of two-disk combinations in the array: (0.029 failures/year)^2 x 7.9E-4 years x 16*15/2 combinations = 0.08E-3 failures per year.
  • The probability of another media error in the array is the number of bits left to read times the error rate per bit: 500GB*8 * 14 * 1E-14 = 0.56.
  • Multiplying, the result is 0.08E-3 * 0.56 = 0.045E-3 data loss failures per year.
  • This is quite a lot higher than 1 in 30 million, so it is a conservative estimate; however, it is a significant improvement in data protection relative to single-parity RAID.
• The rate of single-disk failures in a 16-disk single-parity RAID group:
  • 0.029 * 16 = 0.464
  • The probability of a bit error during reconstruction is 500GB*8 * 15 * 1E-14 = 0.60.
  • This gives a total failure rate of 0.278 data loss failures per year, which is roughly 6,186 times higher than the RAID-DP rate. This ratio is consistent with the claim that RAID-DP gives about 1,000 to 10,000 times better data protection than single-parity protection does.
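The arithmetic above can be checked with a short script. The 0.029 failures/year rate and 7.9E-4-year reconstruction time are the report's own inputs; small differences from the report's rounded figures (e.g. a ratio near 6,200 rather than exactly 6,186) are expected.

```python
AFR = 0.029                  # single-disk failures per drive per year
RECON_YEARS = 7.9e-4         # reconstruction time, in years (~7 hours)
NDISKS = 16                  # drives in the RAID group
BITS_PER_DRIVE = 500e9 * 8   # 500GB SATA drive, in bits
UER = 1e-14                  # SATA uncorrectable error rate per bit

# Double-parity (RAID-DP): two near-simultaneous failures, then a
# media error while both drives are in rebuild.
pairs = NDISKS * (NDISKS - 1) / 2                     # 120 two-disk combinations
p_two_failures = AFR**2 * RECON_YEARS * pairs         # ~0.08E-3 per year
p_media_error = BITS_PER_DRIVE * (NDISKS - 2) * UER   # 0.56
raid_dp_rate = p_two_failures * p_media_error         # ~0.045E-3 per year

# Single-parity 16-disk group: one failure plus a bit error on rebuild.
p_one_failure = AFR * NDISKS                          # 0.464
p_bit_error = BITS_PER_DRIVE * (NDISKS - 1) * UER     # 0.60
single_parity_rate = p_one_failure * p_bit_error      # ~0.278 per year

print(raid_dp_rate, single_parity_rate, single_parity_rate / raid_dp_rate)
```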

• Bit error rate sources:
  • http://www.hitachigst.com/hdd/support/7k500/7k500.htm
  • http://www.seagate.com/docs/pdf/datasheet/disc/ds_cheetah10k.7.pdf
