Intel® Expressway Service Gateway Web Service/API...

12
Intel® Expressway Service Gateway Web Service/API Security Performance Comparison to IBM* DataPower XI50 Take advantage of continuous chip optimization improvements such as XEON E5-2600 with the easily upgradeable software appliance form factor. Intel® Expressway Service Gateway outpaces IBM DataPower by 6X to 10X in a direct “apples to apples” comparison. Appliance server hardware utilized Intel’s new XEON processor E5-2600 for ground breaking security performance WHITE PAPER Service Gateway Intel® XEON Optimized Performance Comparison Report

Transcript of Intel® Expressway Service Gateway Web Service/API...

Page 1: Intel® Expressway Service Gateway Web Service/API …info.intel.com/rs/intel/images/Comparison_to_IBM_DataPower_XI50.pdf · Intel® Expressway Service Gateway Web Service/API Security

Intel® Expressway Service Gateway Web Service/API Security Performance Comparison to IBM* DataPower XI50

Take advantage

of continuous chip

optimization improvements

such as XEON E5-2600

with the easily upgradeable

software appliance form

factor. Intel® Expressway

Service Gateway outpaces

IBM DataPower by 6X to

10X in a direct “apples to

apples” comparison.

Appliance server hardware utilized Intel’s new XEON processor E5-2600 for ground breaking security performance

WHITE PAPERService GatewayIntel® XEON Optimized Performance Comparison Report

Page 2: Intel® Expressway Service Gateway Web Service/API …info.intel.com/rs/intel/images/Comparison_to_IBM_DataPower_XI50.pdf · Intel® Expressway Service Gateway Web Service/API Security

Brand Note: Intel® Expressway Service Gateway is available direct from Intel or branded as McAfee® Services Gateway when procured as part of the McAfee Cloud Security Platform suite. Hereafter, the paper will refer to the Intel short name “Intel® ESG”.

Executive Summary Hardware appliances such as the IBM* DataPower XI50 are typically deployed in datacenters to solve integration problems around SOA and REST security, governance and mediation. While these types of highly specialized hardware appliances may solve a few specific problems, they are slow to evolve, difficult to repurpose, and incur higher maintenance and operational costs as compared to general purpose server software solutions. Furthermore, hardware appliances are at odds with major cost-saving datacenter trends such as multi-core and virtualization. As financial constraints on IT become more severe, new products are needed to avoid the compromise inherent in a hardware appliance between performance, and the flexibility and extensibility otherwise available in software solutions. In this new environment, a multi-core optimized software appliance or service gateway is superior in offering high performing security, governance, integration, and mediation capabilities at a lower total cost.

Intel’s Application Security and Identity Products group has released Intel® ESG, a highly tuned, multi-core optimized software service gateway that provides the full security (runtime governance) , integration and mediation functionality with a modern graphic design interface It makes application deployment on multi-core environments significantly more scalable and manageable. Designed to be easily managed, just like hardware appliances, Intel®’s Expressway Service

Gateway is deployable in minutes rather than days and requires zero coding for common service intermediary usage patterns. Intel® ESG offers mediation capabilities that can deal with heterogeneous service and legacy environments and different messaging patterns required in service interactions. Intel® ESG also offers common shared security enforcement for XML threat protection (XML Firewall), authentication, authorization, access control and auditing.

In this paper we provide:

• A description of the Intel® ESG product and architectural elements and how they are taking advantage of continuously evolving Intel processors, platforms and technologies in delivering unparalleled performance.

• Results demonstrating the superior performance of Intel® ESG as compared to IBM DataPower XI50 for service mediation and security (runtime governance) use cases.

• A cost comparison showing the advantage of deploying Intel® ESG as compared to the IBM DataPower XI50.

Table of Contents

Introduction . . . . . . . . . . . . . . . . . . . . . . . .1

Executive Summary . . . . . . . . . . . . . . . . .2

1. Expressway Service Gateway Overview . . . . . . . . . . . . . . . . . .3

2. Performance Architecture . . . . . . . .3

2.1 Intel® Multi-Core Optimizations . . . . . . . . . . . . . . . . . . . . . .3

2.2 Efficient XML Processing . . . . . . .3

2.3 Efficient Service Mediation Engine . . . . . . . . . . . . . . . . . .3

3. Next Generation Service Gateway Architecture . . . . . . . . . . . . . . .3

4. Performance Testing . . . . . . . . . . . . .4

4.1 Methodology . . . . . . . . . . . . . . . . . . .4

4.2 Test Environment . . . . . . . . . . . . . .5

4.3 Test Execution and Setup . . . . . .5

4.3.1 Service Mediation . . . . . . . . . .6

4.3.2 Service Governance . . . . . . . .6

4.3.3 Traffic Pass-Through . . . . . . .6

5. Test Results . . . . . . . . . . . . . . . . . . . . . .7

5.1 Traffic Pass-Through . . . . . . . . . . .7

5.2 Service Mediation Test Results . . . . . . . . . . . . . . . . . . . . . . .8

5.3 Service Governance Test Results . . . . . . . . . . . . . . . . . . . . . . .9

6. Cost Comparison . . . . . . . . . . . . . . . 10

Conclusion . . . . . . . . . . . . . . . . . . . . . . . . .11

About Expressway Service Gateway . . . . . . . . . . . . . . . . . . .11

Intel® ESG provides superior performance for demanding security (runtime governance), and mediation use cases outpacing IBM DataPower in a direct “apples-to- apples” comparison by 6X to 10X at a fraction of the total cost.

2

Intel® Expressway Services Gateway Performance Comparison

2

Page 3: Intel® Expressway Service Gateway Web Service/API …info.intel.com/rs/intel/images/Comparison_to_IBM_DataPower_XI50.pdf · Intel® Expressway Service Gateway Web Service/API Security

1. Intel® Expressway Service Gateway OverviewIntel® ESG is a software-appliance designed to simplify and secure application architecture on-premise or in the cloud. It expedites deployments by addressing common security and performance challenges – it accelerates, secures, integrates and routes XML, web services and legacy data in a single, easy to manage software appliance form factor. It pro vides the following benefits:

Secure the Edge: The static network perimeter is gone. Regain control with a centralized policy enforcement point to authenticate, authorize, and govern service interactions with customers, partners, employees, and cloud providers (IaaS/PaaS) as they consume or deploy applications.

Build Dynamic Apps: Innovate faster, gain a competitive edge, and securely expose mass customizable applications on-premise or in the cloud, regardless of the abstraction pattern (SOA, WOA), delivery method (App Store, Cloud) or protocol (REST, SOAP).

Get More…Go Neutral: Break free of purpose-built inflexible appliances tied to expensive vendor suites. When it comes to lower cost, multi-core optimized performance, ready middleware interoperability and vendor viability, protect your investment with Intel.

Flexibility without Compromise: Simplify infrastructure by deploying on standard Intel® Multi-Core servers. Address common SOA/REST service and security bottlenecks with specialized soft-appliances without compromising extensibility or virtualization.

Rapid Deployment: Decoupled design-time (Services Designer) increases security for enterprise roles and allows customers to have multiple copies of the design environment without extra gateway purchases. The designer supports rapid prototyping of application execution without requiring a gateway instance and utilizes standard Eclipse which runs on low-cost laptops and hardware.

The technology behind Intel’s ESG is based on 10-years of research and development both within Intel and through the acquisition of Sarvega, Inc and has been field tested in some of the largest security and performance-sensitive deployments of distributed SOA and REST design patterns.

2. Performance Architecture2.1 Intel® Multi-Core Optimizations

Intel® ESG provides a software engine that is highly optimized for Intel Multi-Core architectures, including Intel® Xeon 5500, 5600 and the newest record-breaking Xeon E5 based servers – that deliver leadership performance, best data center performance per watt, and break through I/O innovation. Intel® ESG helps bridge the gap between multi-core optimizations and business software by providing a runtime layer that efficiently scales with the Intel processor roadmap, bringing Moore’s law to business computing without requiring a heavy investment in custom programming.

2.2 Efficient XML Processing

Efficient XML Processing Intel ESG provides support for XML acceleration by building a number of highly optimized XML processing engines using an efficient binary representation called EventStream (ES). ES scales effectively with Intel Multi-Core architectures and platform enhancements without any special hardware. With this set of XML processing engines, ESG executes XML operations such as XPath evaluation, XSLT, XML Schema validation, XML-Content attack filtering, XML content-based routing, and XML/WS-Security (including canonicalization) at wire speed, removing traditional XML processing bottlenecks. Intel is continuously evolving its EventStream architecture to provide increased scalability with the latest multi-core platforms. Finally, because Expressway doesn’t depend on special hardware, it can be virtualized to achieve the same performance as a dedicated hardware server.

2.3 Efficient Service Mediation Engine

While XML optimizations form an important cornerstone of Intel® ESG’s performance characteristics, they are only a portion of the performance equation. Aside from point optimizations such as faster transforms (XSLT) or faster schema validation (XSD), the real problem is finding a good way to mediate between the services that form the underlying components of the larger service or distributed application. For instance, any gain in XML acceleration may be offset by inefficiency in the many layers involved in a large service application with respect to application data I/O, either between local or remote services. To solve this overhead problem, Intel® ESG offers a light-weight service mediation engine that interprets message policies in byte-code form with a zero-copy data interface. This design allows for efficient scaling across multiple threads and allows Intel® ESG to offer predictable, linear scalability per core for thousands of incoming connections.

3. Next Generation Service Gateway ArchitectureThe performance numbers and traffic behavior are a result of a continuously optimized Expressway runtime architecture. The new architecture in concert with the increased performance of the Xeon E5 class architecture resulted in considerable performance gains, eclipsing the previous architecture by well over 200% in both security and mediation tests.

With the new Xeon E5-2600 series featuring a significant performance improvement and an appreciably smaller level of power consumption, Intel continues to be on the forefront of addressing the unprecedented demand for efficient, secure, and high-performing datacenter infrastructure.

Reduced Jitter under Heavy Client Load

The newly optimized architecture boasts a number of performance improvements that apply to Expressway running on all Intel® Xeon servers. Performance optimizations include enhancements for XML processing leveraging the Intel SSE4.2 instructions, parallel execution, memory management and workflow optimizations.

3

Intel® Expressway Services Gateway Performance Comparison

Page 4: Intel® Expressway Service Gateway Web Service/API …info.intel.com/rs/intel/images/Comparison_to_IBM_DataPower_XI50.pdf · Intel® Expressway Service Gateway Web Service/API Security

evaluation and an improved copyTree() internal function for faster in-memory node copying.

• Code Refinements: The ES-III data structure and processing library includes further code refinements including the optimization of the remove operation by using ESII_LINK, removal of redundant checks and unnecessary decoding. These improvements result in higher performance and better memory usage on multi-core environments.

The next generation architecture improves the underlying XML representation, raw TPS and latency and also improves the overall traffic pattern and behavior. For example, the graphs show in here display a noticeable reduction of jittering in both message rate and latency. The results obtained here display very ‘smooth’ curves that represent highly predictable processing behavior for throughput (messages per second), latency (ms), and CPU utilization. The numbers and curves show the progression and evolution of the Expressway product. Notice how the curves smooth out and become more predictable as the software evolves.

4. Performance Testing4.1 Methodology

Our testing methodology involves use cases with real-world applicability rather than synthetic benchmark that only highlight a few peak numbers. As such, we have designed an environment to tests three aspects of each service intermediary: (a) Messages per Second, (b) Latency, and (c) CPU Utilization. The first metric measures how many messages/transactions the intermediary can process per second and is sometimes abbreviated MPS or TPS. The second metric is the time interval in which a client request is received and responded, and the third metric is an indicator of how much headroom is remaining on the system – this is commonly presented as a percentage as either CPU idle or CPU used.

• Copy-on-Write (Copy Avoidance): Optimization in document storage lifecycle management to enable copy-on-write semantics and aggressive garbage-collection thereby reducing processing overhead.

• Smart Processing Semantics: Implementation of smart processing semantics in the workflow engine to eliminate redundant data processing thereby reducing processing overhead.

• Advanced Intel Architecture (IA) Support: Automatic detection and dynamic selection of advanced IA instructions which results in reduced processing overhead and maximal use of CPU capabilities.

Efficient Binary Representation using Event-Stream-III (ES III):

Continuing evolution of the patented XML Event Stream technology pioneered by Sarvega, Event Stream III further advances XML performance through structural and algorithmic optimizations based on data obtained from profiling and analysis of real world applications. Additionally, it leverages Intel SSE4.2 instructions available in the Intel Architecture (IA) CPUs for hardware accelerated XML processing. As a result, it provides faster XML Event Stream building, faster element iteration, and faster in-place modification than its predecessor Event Stream II. Some of the improvements to ES III include:

• Accelerated iterator processing and event stream building: Support for fixed-length (16 bytes) record sizes and separation of text information and structural information provides better memory efficiency.

• Accelerated update operation: Reduces “trunk management” on the stream and merges the main and variable stream storage locations. Additional improvements include lazy namespace scope resolution for faster variable

Optimized Parallel Execution to deliver linear scalability with the number of CPU cores:

• Thread Handling: Improved thread handling for client input that improves CPU utilization and thread-to-core matching, allowing the application to more fully utilize the available cores.

• Cache Line Eviction: Optimizations to avoid false sharing of the cache line resulting in the reduction of cache line eviction which leads to better processing efficiency in a multi-core environment.

• Exponential Back-off: Implementation of exponential-back-off in spin lock contention resolution which results in increases parallelism in multi-core environments.

• Intel Architecture (IA) Lock Synchronization Primitives: Refinement of locking strategy to take the advantage of IA synchronization primitives over standard operating system synchronization primitives thereby increasing parallelism in multi-core environment.

Memory Management to address memory contention with larger core counts:

• Scalable Memory Allocation: Reduced memory contention and switches from user space to kernel space through scalable memory allocation.

• Local Memory Allocator: Implementation of per thread local memory allocator for small memory objects to reduce global allocator contention which results better parallelism in multi-core environments.

Workflow ( Policy Execution ) Optimizations to reduce processing overhead:

• Object Local Cache: Implementation of heavyweight object local cache to eliminate creation and destruction overhead thereby reducing processing overhead.

4

Intel® Expressway Services Gateway Performance Comparison

Page 5: Intel® Expressway Service Gateway Web Service/API …info.intel.com/rs/intel/images/Comparison_to_IBM_DataPower_XI50.pdf · Intel® Expressway Service Gateway Web Service/API Security

Backend ServerClient Machine

ServiceIntermediary

While a very high peak number may look impressive, what really matters is the interplay between these three metrics to achieve a balanced, predictable and stable behavior under heavy load. It is not enough to look at a single metric without considering the other metrics. For instance, the total throughput (TPS) must be qualified against CPU utilization – if the CPU utilization is low, it may be possible to increase throughput further by pushing more clients threads with no degradation in perceived latency. Also, if CPU utilization is high or nearly maxed out we would expect to see predictable latency (the time taken to complete a request) to increase as the load increases. It is important for the Service Intermediary to be capable of graceful handling of heavy client loads while maximizing the CPU utilization with predicable behavior. In other words, we expect latency to increase but the system to remain stable. Similarly, CPU utilization can be compared against latency. If two products have similar latencies between requests and one has more CPU headroom, the product with more headroom may be able to handle increased throughput while maintaining client service-levels.

To achieve this balance, the test environment was designed to minimize network overhead while at the same time ramping up the number of requesting threads in a gradual manner to determine the optimal balance for all three metrics.

4.2 Test Environment

The test environment was constructed as indicated in Figure 1.

The service intermediary represents the product under test, which was either the IBM DataPower XI50 or one of the versions of Intel® ESG as described in Table 1.

The tests were run on an isolated network using a 1Gbit network switch connecting the client machine, intermediary, and back-end server. The client is a custom-

Figure 1: Environment Topology

INTEL® ESG vERSION

CPU CONFIGURATION MEMORy MANUFACTURER

OPERATING SySTEM

V2.0 Dual Socket Intel® Xeon® CPU X5560 @ 2.8Ghz

6x2GB 1333 DDR3 RDIMMs (12G total)

Supermicro Red Hat Linux AS4

V2.0 Dual Socket Intel® Xeon® CPU E5450 @ 3.00GHz

4x2GB FBD PC2-5300 (8G total)

HP ProLiant DL360 G5

Red Hat Linux AS4

V2.7 Dual Socket Intel® Xeon CPU E5 @ 2.30GHz

4x8GB (32GB total)

Supermicro Red Hat Linux AS5

Table 1

built generic HTTP client designed to scale to a large number of concurrent client connections and the back-end server is a custom-built generic HTTP web server also designed for low latency and high volumes. Both the client and server are designed to be efficient in memory and CPU usage in order to prevent bottlenecking. For example, the web server simulates a service response and returns a predefined canned response – no actual work or service call is executing on the backend server.

4.3 Test Execution and Setup

In order to gauge the behavior of each system under load, we used ramping style tests that measured the latency and throughput at various concurrent

client connection levels spanning from low single digits to a maximum of 450 threads. The IBM DataPower appliance we configured a single XML Firewall2 application with the application probe disabled. For Intel® ESG we configured a workflow application using the Intel® Services Designer GUI. No special tuning was done on the Intel® ESG and in both cases each product used one NIC interface for the client side and another for the backend server side.

The next two sections describe the processing steps for the two main cases: service mediation and service governance.

5

Intel® Expressway Services Gateway Performance Comparison

Page 6: Intel® Expressway Service Gateway Web Service/API …info.intel.com/rs/intel/images/Comparison_to_IBM_DataPower_XI50.pdf · Intel® Expressway Service Gateway Web Service/API Security

4.3.1 Service Mediation

The service mediation use case involves a SOAP v1.1 purchase order message sent over HTTP that undergoes XML schema validation, XSL transformation and endpoint routing. The transformation alters the value a single node (shipping address) in the purchase order to simulate service mediation from one format to another. The response from the back-end server is then schema validated for grammar correctness and returned to the calling client. Figure 2 shows a graphical illustration of the service mediation pipeline.

4.3.2 Service Governance

The service governance use case involves applying a security policy at runtime on the response leg. In this case, the same

SOAP v1.1 purchase order was sent to the intermediary for schema validation. The response from the back-end server includes a schema validation step, a WS-Security signature over the SOAP Body and then a WS-Security encryption step over the SOAP body. The signature algorithm used was 1024-bit RSA w/SHA-1 and the encryption algorithm was 3DES-CBC with a 1024 bit RSA key-wrap. Figure 3 shows a graphical illustration of the service governance pipeline.

4.3.3 Traffic Pass-Through

In addition to the two aforementioned tests, a third “pass-through” or service virtualization test (see Figure 4) was also conducted to determine peak pass-through rates. In a pass-through test the job of the service intermediary is

to act as a pure HTTP message proxy. That is, it receives the SOAP v1.1 request and passes it along to the back-end server with no additional processing. To configure a pass-through test the IBM DataPower was configured using a manual front-side HTTP handler (no WSDL) and the Intel® ESG was configured with a receive, invoke and reply policy pipeline. A pass-through test is a good baseline to determine the upper bound on the HTTP and network processing before additional processing steps are applied. In addition to basic service firewall protection, the pass-through scenario is useful in determining the potential of the connection handling and processing inherent in each product.

Service ProviderService Client

HTTP HTTP

Service ProviderService ClientXSD ValidationXSD Validation WS-Security/Sign

XSDValidationHTTP HTTP

Figure 3: Service Governance Workflow

Figure 4: Pass Through Workflow

Service ProviderService Client

XSDValidation XSLT XSD

Validation

XSDValidation

HTTP HTTPRouting

Figure 2: Service Mediation Workflow

6

Intel® Expressway Services Gateway Performance Comparison

Page 7: Intel® Expressway Service Gateway Web Service/API …info.intel.com/rs/intel/images/Comparison_to_IBM_DataPower_XI50.pdf · Intel® Expressway Service Gateway Web Service/API Security

100.00

90.00

80.00

70.00

60.00

50.00

40.00

30.00

20.00

10.00

0.00

52.00

25.29

40.76 CPU

(%)

Product Configuration

Intel® Xeon® CPU X5560 @ 2.80GHZ 360-Client

Intel® Xeon® CPU X5560 @ 2.80GHZ 60-Client

Intel® Xeon® CPU E5 @ 2.30GHZ 60-Client

Intel® Xeon® CPU E5 @ 2.30GHZ 360-Client

Intel® Xeon® CPU X5450 @ 3.00GHZ 240-Client

Intel® Xeon® CPU X5450 @ 3.00GHZ 40-Client

Intel® Xeon® CPU X5450 @ 3.00GHZ 10-Client

DP XI50 10-Client

66.27

77.37

84.11 90.00 90.00

5. Test ResultsThe next three sections describe the throughput, latency and CPU utilization results for the pass-through test.

5.1 Traffic Pass-ThroughFigure 5 demonstrate the baseline proxy performance for each product at a different number of client threads. In all cases, the Intel® ESG was able to handle considerably more pass-through traffic. In the extreme case of the Intel® Xeon® E5 platform, Intel® ESG was able to handle more than 10 times the throughput as compared to the IBM DataPower XI50. It is natural to ask why IBM DataPower was only tested with 10 client threads in this case. We observed that the CPU utilization (shown in Figure 6) was already at 90% and furthermore, errors were observed with thread counts higher than ten3.

Next we look at the latency between requests in the pass-through case. As shown in Figure 6.

Figure 6 shows the latency comparisons for the baseline pass-through case. In these graphs the latency is the wait time as perceived by each worker thread. Here we can see that once again, the IBM DataPower XI50 is only able to scale to 10 threads while Intel® ESG on the Intel Xeon X5560 and Xeon E5 are handling 6 times the traffic at a lower latency. Moreover, we can see the difference in latency with the new Expressway runtime architecture and Xeon E5 platform: at high thread levels the latency has been cut by more than half. Intel® ESG also shows the lowest latency in the 10 client test as well, demonstrating over one-third the wait time as compared to the IBM DataPower XI50. The next data point is CPU utilization for each product, which is shown as follows in Figure 7.

Figure 7 shows how much CPU is utilized during the pass-through tests. Here we can see that the available headroom on the IBM DataPower XI50 product is nearly exhausted at 10 clients and 3600 TPS while Intel® ESG still has nearly half its potential headroom available in the case of the Intel Xeon 5560 Platform at 60 clients. The performance is even more efficient with the Xeon E5 – even at 360 threads the CPU is still idle about a quarter of the time. Overall, the DataPower product shows the lowest available headroom at the lowest number of threads, indicating it is nearly maxed out. In the next section we will look at how each product scales over time for both the service mediation and service governance use cases.

40000.00

35000.00

30000.00

25000.00

20000.00

15000.00

10000.00

5000.00

0.00

38353.43

28359.44

TPS

Product Configuration

Intel® Xeon® CPU X5560 @ 2.80GHZ 360-Client

Intel® Xeon® CPU E5 @ 2.30GHZ 60-Client

Intel® Xeon® CPU E5 @ 2.30GHZ 360-Client

Intel® Xeon® CPU X5560 @ 2.80GHZ 60-Client

Intel® Xeon® CPU X5450 @ 3.00GHZ 240-Client

Intel® Xeon® CPU X5450 @ 3.00GHZ 40-Client

Intel® Xeon® CPU X5450 @ 3.00GHZ 10-Client

DP XI50 10-Client

Intel® Xeon® CPU X5560 @ 2.80GHZ 10-Client

30432.49

22594.33

17075.49

13333.85

11126.009418.00

3618.18

16.00

14.00

12.00

10.00

8.00

6.00

4.00

2.00

0.00

0.77

Late

ncy

(m)

Product Configuration

Intel® Xeon® CPU X5560 @ 2.80GHZ 360-Client

Intel® Xeon® CPU X5560 @ 2.80GHZ 60-Client

Intel® Xeon® CPU X5450 @ 3.00GHZ 240-Client

Intel® Xeon® CPU X5450 @ 3.00GHZ 40-Client

Intel® Xeon® CPU X5450 @ 3.00GHZ 10-Client

DP XI50 10-Client

Intel® Xeon® CPU X5560 @ 2.80GHZ 10-Client

0.93 1.692.48

2.58 2.84

5.47

12.15

13.93

Figure 5: Pass-Through TPS Intel® Expressway Service Gateway vs. IBM DataPower XI50

Figure 6: Latency Intel® Expressway Service Gateway vs. IBM DataPower XI50

Figure 7: CPU Utilization Intel® Expressway Service Gateway vs. IBM DataPower XI50

Intel® Xeon® CPU E5 @ 2.30GHZ 60-Client

Intel® Xeon® CPU E5 @ 2.30GHZ 360-Client

7

Intel® Expressway Services Gateway Performance Comparison

Page 8: Intel® Expressway Service Gateway Web Service/API …info.intel.com/rs/intel/images/Comparison_to_IBM_DataPower_XI50.pdf · Intel® Expressway Service Gateway Web Service/API Security

5.2 Service Mediation Test Results

The following figures compare the performance of the IBM DataPower XI50 to Intel® ESG running on both Intel platforms for service mediation. Each section shows a plot over time comparing the TPS, latency and idle CPU as the number of client threads increase.

Figure 8 shows the throughput measured messages per second (TPS) as the number of client threads increases. Here we can see Intel® ESG on the E5450 platform hitting a peak throughput of about 6200 TPS and about 9600 TPS on the X5560 platform before leveling off. If we compare the previous Expressway architecture to the next generation software on the Xeon E5 we can see the numbers peak at about 18000 TPS. The leveling off behavior indicates that throughput levels are being maintained at the expense of latency (see Figure 8). That is, despite the number of clients, Intel® ESG is behaving predictably. Moreover, the smooth logarithmic behavior of the next-generation Expressway architecture contrasts some of the peaking seen in the previous generation software. This contrasts the IBM DataPower XI50, which hits its peak at about ~2600 TPS before degrading to just under 1000 TPS with heavy load. In this case, the IBM DataPower XI50 is unable to scale to the higher thread levels.

Figure 9 shows the latency in milliseconds between client requests for each product. Here we can see that Intel® ESG has linear behavior with respect to the time between requests, even under heavy loads. Intel® ESG’s slight initial advantage in latency quickly expands over IBM DataPower XI50 as the client threads increase. The IBM DataPower XI50 begins with steeper linear behavior, but this turns erratic at about 200 threads, indicating difficulty in scaling stably at higher thread levels. It is also important to note that on the faster Intel platform (X5560) the latency slope is more gradual, indicating that Intel® ESG is scaling with high predictability and stability based on platform improvements alone. Comparing further, we can see that the predictability and linear behavior continues its flattening trend with the next generation architecture on the Xeon E5 hardware as the line approaches a progressively smaller slope.

Figure 10 shows the percentage of available CPU as the number of client threads increase. Here we can see that Intel® ESG (E5450) and IBM DataPower XI50 are about on-par for available CPU (around 10%) for the first 200 threads. As the threads increase, however, IBM DataPower begins to have scalability problems. Even though more of the CPU becomes available it is unable to process the increased number of client requests and maintain throughput levels (See Figures 8 and 9). For Intel® ESG on the Intel X5560 platform we can see a large increase in available CPU once again due to Intel software and platform improvements. The next generation architecture and E5 platform make the CPU behavior even more dramatic. The smooth logarithmic curve indicates that the updated runtime architecture is able to utilize the available processing power much more efficiently. This can be seen around thread 200 where the previous architecture remains static but the updated architecture on the Xeon E5 makes increased usage of the available processing power.

350

300

250

200

150

100

50

0

Thread Count

0 50 100 150 200 250 300 350 400 450

Late

ncy

(ms)

Thread Count

0 50 100 150 200 250 300 350 400 450

20000

18000

16000

14000

12000

10000

8000

6000

4000

2000

0

Thro

ughp

ut (T

PS)

Thread Count

Idle

CP

U %

0 50 100 150 200 250 300 350 400 450

Intel® ESG (X5560)Intel® ESG (E5450)DataPower (XI50)Intel® ESG (Xeon E5)

80

70

60

50

40

30

20

10

0

Figure 8: Throughput Intel® Expressway Service Gateway vs. IBM DataPower XI50

Figure 9: Latency Intel® Expressway Service Gateway vs. IBM DataPower XI50

Figure 10: Idle CPU Intel® Expressway Service Gateway vs. IBM DataPower XI50

8

Intel® Expressway Services Gateway Performance Comparison

Page 9: Intel® Expressway Service Gateway Web Service/API …info.intel.com/rs/intel/images/Comparison_to_IBM_DataPower_XI50.pdf · Intel® Expressway Service Gateway Web Service/API Security

5.3 Service Governance Test Results

The following figures compare the performance of the IBM DataPower XI50 to Intel® ESG running on both Intel platforms for service governance (security). Each section shows a plot over time comparing the TPS, latency and idle CPU as the number of client threads increase. It should be noted that the service governance scenario is considerably more taxing to each system due to the addition of security operations. While the absolute throughput numbers are expected to be lower for all the systems under test, the overall behavior under heavy client load should remain consistent with the service mediation case.

Figure 11 shows the throughput measured in messages per second (TPS) as the number of client threads increase. Here we can see Intel® ESG on the E5450 platform hitting a peak throughput of about 2030 TPS and about 3700 TPS on the X5560 platform before leveling off. The next generation Expressway architecture on the Xeon E5 platform hits a peak of 8000 TPS, which is more than double the previous generation architecture and hardware. This contrasts the IBM DataPower XI50 which achieves a peak throughput of 1280 TPS before leveling off and then eventually degrading under heavy load. It should be noted that we obtained the lower bound performance advantage of about 6X by considering the sustained throughput of Intel® ESG on the Xeon E5 platform as compared to IBM DataPower XI50.

Figure 12 shows the latency in milliseconds between client requests for each product. Here we can see that Intel® ESG has linear behavior with respect to the time between requests, even at heavy loads. In the latency comparison between Intel® ESG on the E5450 and X5560 it is also important to see that the difference in slope is larger for the service governance case as compared to the latency slope shown in Figure 8. This suggests that platform improvements in the X5560 are more applicable to the service governance case and that Intel® ESG is once again successfully scaling on the latest Intel platforms. The updated runtime architecture on the Xeon E5 takes this even one step further. Figure 12 shows a more progressive flattening of the slope, indicating lower latency over time, a direct result of the advanced software architecture more fully utilizing Intel® Multi-Core optimizations.

Figure 13 shows the percentage of available CPU as the number of client threads increase. While the DataPower XI50 has more available CPU than Intel® ESG on the E5450, it is unable to make use of the extra processing power to scale its throughput or reduce its latency (see Figures 11 and 12). Moreover, the spike in free CPU for DataPower is due to excessive errors which begin at thread 169. In this case, the DataPower XI50 is unable to handle the increased number of threads and begins to lose requests; in other words, the spike in idle CPU is a symptom of overload rather than an indicator of increased headroom. Also, in the case of Intel® ESG on the X5560 we can see a dramatic increase in the amount of idle CPU, which is further improved by the next generation runtime architecture on the Xeon E5.

9000

8000

7000

6000

5000

4000

3000

2000

1000

0

Thread Count

0 50 100 150 200 250 300 350 400 450

Thro

ughp

ut (T

PS)

Idle

CP

U %

Thread Count

0 50 100 150 200 250 300 350

80

70

60

50

40

30

20

10

0

350

300

250

200

150

100

50

0

Thread Count

0 50 100 150 200 250 300 350 400 450

Late

ncy

(ms)

Figure 11: Throughput Intel® Expressway Service Gateway vs. IBM DataPower XI50

Figure 12: Latency Intel® Expressway Service Gateway vs. IBM DataPower XI50

Figure 13: Idle CPU Intel® Expressway Service Gateway vs. IBM DataPower XI50

Intel® ESG (X5560)Intel® ESG (E5450)DataPower (XI50)Intel® ESG (Xeon E5)

9

Intel® Expressway Services Gateway Performance Comparison

Page 10: Intel® Expressway Service Gateway Web Service/API …info.intel.com/rs/intel/images/Comparison_to_IBM_DataPower_XI50.pdf · Intel® Expressway Service Gateway Web Service/API Security

6. Cost ComparisonThe following table shows an initial upfront cost comparison for Intel® ESG versus IBM DataPower XI50 for a typical deployment consisting of 2 units deployed at 2 different physical locations, each with a single hot-standby unit (4 units total).

PRODUCT PER-UNIT PRICEPER UNIT MAINTENANCE & SUPPORT

TOTAL UPFRONT COST (4 UNITS)

BM* Websphere DataPower XI507 Total Price

$100,000 $9,750 $439,000

Intel® Expressway Service Gateway Production Units8 $64,000 $12,600

Intel® Expressway Service Gateway Hot Standby Units $34,000

Intel® Expressway Service Gateway

Total Price

$5,800 $232,800

Table 2: Initial Cost Comparison

While discounting may apply in both cases, Intel® ESG’s list price is only about 50% of the IBM DataPower XI50, despite its significant performance advantage. Moreover, since Intel® ESG runs on standard Intel servers, it can be deployed on existing server hardware or virtual machines in the datacenter rather than

requiring the acquisition of purpose-built appliances. Additionally, the Intel® ESG soft appliance form factor allows datacenter to update server hardware independently riding on Intel’s hardware performance evolution while maintaining system architecture continuity.

10

Intel® Expressway Services Gateway Performance Comparison

Page 11: Intel® Expressway Service Gateway Web Service/API …info.intel.com/rs/intel/images/Comparison_to_IBM_DataPower_XI50.pdf · Intel® Expressway Service Gateway Web Service/API Security

ConclusionThis detailed “apples-to-apples” comparison demonstrates that the Intel® ESG soft appliance running on Intel Multi-Core servers easily outperforms a custom built hardware appliance on real-world use cases. Not only can Intel® ESG achieve higher “peak” throughput, it also demonstrates predictability, linear scalability, and the ability to take advantage of Intel microprocessor advancements and apply them directly to business processing without any specialized coding or custom configurations. We demonstrated a lower bound performance metric of 6X on the service governance use case and an upper bound throughput metric of 10X on the pass-through use case, with an average advantage of 6X improvement at different thread levels. This clear advantage coupled with the lower cost and increased flexibility of a software solution makes Intel® ESG a compelling choice to increase performance and reduce costs and complexity for service integration in the datacenter.

About Intel® Expressway Service GatewayIntel® ESG (also available as McAfee Services Gateway under the McAfee Cloud Security Platform suite) is a software-appliance designed to simplify and secure application architecture on-premise or in the cloud. It expedites deployments by addressing common security and performance challenges – it accelerates, secures, integrates and routes XML, web services and legacy data in a single, easy to manage software appliance form factor.

About Intel Xeon Processor E5-2600 seriesThe Intel Xeon processor E5-2600 is record breaking, delivering leadership performance, best data center performance per watt and break-through I/O innovation. It addresses the unprecedented demand for efficient, secure, and high-performing datacenter infrastructure. It features up to an 80% performance boost versus prior-generation processors at consistent-power levels with an improved energy efficient performance of up to 50%.

11

Intel® Expressway Services Gateway Performance Comparison

Page 12: Intel® Expressway Service Gateway Web Service/API …info.intel.com/rs/intel/images/Comparison_to_IBM_DataPower_XI50.pdf · Intel® Expressway Service Gateway Web Service/API Security

For product comparison information: http://www.intel.com/go/identity

Americas: 1-800-253-3696 All other Geographies: 1-978-948-2585

Contact us by email: [email protected]

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR.

Intelmaymakechangestospecificationsandproductdescriptionsatanytime,withoutnotice.Designersmustnotrelyontheabsenceorcharacteristicsofanyfeaturesorinstructionsmarked“reserved”or“undefined.”Intelreservestheseforfuturedefinitionandshallhavenoresponsibilitywhatsoeverforconflictsorincompatibilitiesarisingfromfuturechangestothem.Theinformationhereissubjecttochangewithoutnotice.Donotfinalizeadesignwiththisinformation.

Theproductsdescribedinthisdocumentmaycontaindesigndefectsorerrorsknownaserratawhichmaycausetheproducttodeviatefrompublishedspecifications.Current characterizederrataareavailableonrequest.ContactyourlocalIntelsalesofficeoryourdistributortoobtainthelatestspecificationsandbeforeplacingyourproductorder.Copies ofdocumentswhichhaveanordernumberandarereferencedinthisdocument,orotherIntelliterature,maybeobtainedbycalling1-800-548-4725,orbyvisitingIntel’sWebsite atwww.intel.com.

Copyright©2012IntelCorporation.Allrightsreserved.Intel,theIntellogo,andXeonaretrademarksofIntelCorporationintheU.S.andothercountries.*Othernamesandbrandsmaybeclaimedasthepropertyofothers. PrintedinUSA PleaseRecycle324094-004US

1IBM®DataPowerXI50FirmwareRev:XI50.3.6.1.5,Build:1562622Forteststhatrequiredaprivatekey,weensuredthekeywasloadedintotheHSM(HardwareSecurityModule)toobtainthebestperformanceratherthanflashmemory.3Wefoundproblemswithhighernumbersofthreadswithpersistentconnectionsenabled.DisablingpersistentconnectionsallowedtheIBMDataPowerXI50producttoprocessalargernumberofworkerthreadswithouterrors,butatareducedthroughputrate.

4ProcessingerrorswerereportedbytheIBMDataPowerXI50fortheservicemediationcasewhenthreadlevelsreached230.IntelSOAExpresswayproducedzeroerrors.5ProcessingerrorswerereportedbytheIBMDataPowerXI50fortheservicemediationcasewhenthreadlevelsreached169.IntelSOAExpresswayreportedzeroerrors.6Ifweassumeanaverage8084TPSforExpresswayusingtheupdatedarchitectureontheXeonE5ascomparedto1280TPSfortheIBMDataPowerXI50theratioisapproximately6.31X7Source:IBMWebsphereDataPowerSOAApplianceAnnouncement,March2006(http://www.ibm.com/common/ssi/rep_ca/3/897/ENUS106-253/ENUS106253.PDF) 8Thispriceincludesthepriceofasuitablyconfiguredserver(IntelXeonE5450class)aswellasa3yearenterpriseLinuxlicensewithsupport.

•Intel®SOAExpresswayv2.0onIntelXeon –DualProcessorIntel® Xeon®CPUX5560@ 2.80GHzNehalem-EP –Manufacturer:Supermicro –Memory:6x2GB1333DDR3RDIMMs(12Gtotal) –OperatingSystem:RedHatLinuxAS4

•Intel®SOAExpresswayv2.0onHPProLiant –Platform:HPProLiantDL360G5 –DualProcessor –Manufacturer:HP –Memory:4x2GBFBDPC2-5300(8Gtotal)

•IBM*DataPowerXI50 –FirmwareRev:XI50.3.6.1.5 –Build:156262

•Client:HighperformancegenericHTTPLinuxclient•Server:Standardlow-latencyHTTPwebserver –4XQuadcore:GenuineIntel®[email protected] –Memory:16G –RedHatLinuxAS4

•DefaultsettingswereusedforbothIntel® SOA ExpresswayandtheIBM*DataPowerXI50

•Intel®SOAExpresswayv2.7onIntelXeonE5 –DualProcessorIntel® Xeon®[email protected] –Manufacturer:Supermicro –Memory:4x8GB(12Gtotal) –OperatingSystem:RedHatLinuxAS-5,64-bit

Performance Test Details

Intel® Expressway Services Gateway Performance Comparison