Post on 25-Feb-2019
Abstract
This paper describes performance test results for running Kafka with Dell
EMC Isilon F800 All-Flash NAS Storage. A comparison against direct
attached storage is also provided.
20+ MILLION RECORDS A SECOND
Running Kafka with Dell EMC Isilon All Flash F800 Scale-out NAS
Author: Boni Bruno, CISSP, CISM, CGEIT
Chief Solutions Architect, Dell EMC
Apache Kafka Performance with Dell EMC Isilon F800 All-Flash NAS
Copyright © August 2018 Dell Inc. or its subsidiaries. All rights reserved.
Dell believes the information in this publication is accurate as of its publication date. The
information is subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS-IS.“ DELL MAKES NO REPRESENTATIONS
OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND
SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
PARTICULAR PURPOSE. USE, COPYING, AND DISTRIBUTION OF ANY DELL SOFTWARE DESCRIBED IN
THIS PUBLICATION REQUIRES AN APPLICABLE SOFTWARE LICENSE.
Dell, EMC, and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other trademarks
may be the property of their respective owners.
EMC Corporation
Hopkinton, Massachusetts 01748-9103
1-508-435-1000 In North America 1-866-464-7381
www.EMC.com
Contents
Overview
Kafka Introduction
Dell EMC Isilon F800 All-Flash NAS
Performance Test Environment
Performance Test Results
Test 1 - Single producer (1 Broker), no replication, 50M records, 100 Byte record size
Test 2 - Single producer (5 brokers), 3x asynchronous replication, 50M records, 100 Byte record size
Test 3 - Single producer (5 brokers), 3x synchronous replication, 50M records, 100 Byte record size
Test 4 - 5 producers, no replication, 250M records, 100 Byte record size
Test 5 - 5 producers, 3x asynchronous replication, 250M records, 100 Byte record size
Test 6 – Effect of Record Size on Producer Throughput
Test 7 - Single consumer, 50M records, 100 Byte record size
Test 8 - 5 consumers, 250M records, 100 Byte record size
Test 9 – 1 producer & 1 consumer, 50M records written, 50M records read
Test 10 – Stress testing Isilon F800 All-Flash Scale-out NAS
Conclusions
Appendix
Kafka Server Properties
Zookeeper Properties
Producer Properties
Consumer Properties
NFS Client Configuration
Isilon Configuration
OneFS TCP Tuning
Kafka End-to-End Latency Test
Overview
Kafka is a distributed, horizontally scalable, fault-tolerant stream processing system used in many
enterprises. Kafka lets you publish and subscribe to streams of data, and it also stores and
processes that data. It is now part of the Apache Software Foundation, with a commercial version
available through Confluent that includes Kafka software enhancements and enterprise-level
support.
Kafka runs as a cluster and can scale to handle millions of records a second. This paper covers Kafka
performance test results for a Kafka DAS (Direct Attached Storage) cluster using PowerEdge R730XD
servers and a Kafka Isilon F800 NAS (Network Attached Storage) cluster using the same servers.
As you read through this paper, you will see that the Dell EMC Isilon F800 scale-out NAS solution provides
excellent performance and better storage utilization with fewer drives and a smaller storage footprint
than DAS. This is great news for customers looking to simplify their Kafka cluster deployments
and improve storage efficiency.
Kafka Introduction
A Kafka cluster receives records from Producers, stores these records, and
makes them available to Consumers. A general Kafka cluster diagram is shown below for reference.
A key concept to understand with Kafka is what is known as a Topic. Producers publish their records to
a specific topic and consumers can subscribe to one or more of these topics. A Kafka topic is just a
partitioned write-ahead log. Producers append records to these logs and consumers simply subscribe
to the changes. The records consist of a key/value pair. The key is used for assigning the records to a
log partition. Below is an example of a topic with four partitions with writes being appended to the end
of each partition.
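The key-to-partition mapping described above can be sketched as follows. This is only an illustration under stated assumptions, not Kafka's implementation: Kafka's default partitioner uses its own hash function, while this sketch substitutes `zlib.crc32` as a stand-in, and the names `partition_for` and `NUM_PARTITIONS` are hypothetical.

```python
import zlib

NUM_PARTITIONS = 4  # matches the four-partition topic example above


def partition_for(key: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a record key to a partition index.

    Assumption: crc32 is a stand-in hash chosen for illustration only.
    """
    return zlib.crc32(key) % num_partitions


# Records with the same key always land in the same partition, which is
# what preserves per-key ordering within a topic.
for key in [b"sensor-1", b"sensor-2", b"sensor-1"]:
    print(key, "->", partition_for(key))
```

Because the mapping depends only on the key and the partition count, changing the number of partitions changes where keys land.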
Partitions also provide redundancy and scalability. Each partition can be hosted on a different server,
allowing a single topic to be scaled horizontally to increase cluster performance. The term stream in
Kafka is a single topic of data regardless of the number of partitions.
Consumers work as part of a consumer group where one or more consumers work together to consume
a topic. Each partition is only consumed by one member of the consumer group. Below is an example
of three consumers in a single group consuming a single topic. Here two consumers are each working from
one partition and the third consumer is working from two partitions.
Consumers can scale to consume topics with a large number of messages. If a single consumer fails, the
remaining members of the group will rebalance the partitions being consumed to take over for the
failed consumer.
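The ownership rule and the rebalance after a failure can be sketched as below. This is a simplified model, not the protocol Kafka's group coordinator actually runs: it assumes a plain round-robin assignment, and the function and consumer names are hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Give every partition exactly one owner, round-robin over the consumers."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2, 3]
group = ["c1", "c2", "c3"]

# Four partitions over three consumers: one consumer owns two partitions.
before = assign_partitions(partitions, group)

# Consumer c2 fails; the survivors take over all four partitions.
after = assign_partitions(partitions, [c for c in group if c != "c2"])

print(before)  # {'c1': [0, 3], 'c2': [1], 'c3': [2]}
print(after)   # {'c1': [0, 2], 'c3': [1, 3]}
```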
A single Kafka server is called a Broker. Brokers receive messages from producers, assign offsets to
them, and commit the messages to disk. Brokers also service requests from consumers. Brokers are part
of the Kafka cluster, and one broker is elected as the cluster controller to assign partitions to
brokers and detect broker failures. A partition is owned by a single broker, which becomes the
leader of the partition. If a partition is assigned to multiple brokers, the partition is replicated to
provide better redundancy if a broker were to fail.
The diagram below shows replication of partitions in Kafka.
Kafka allows retention policies to be configured where Kafka Brokers retain messages for some period of
time or until a topic reaches a certain size in bytes. Once these limits are reached, messages are
expired and deleted.
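The two retention limits can be sketched as a small function. This is a simplified model under stated assumptions: it treats the log as a list of segments ordered oldest first, and the `Segment` class, the function, and its parameter names are hypothetical rather than Kafka configuration keys.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    age_hours: float
    size_bytes: int


def apply_retention(segments, max_age_hours=None, max_total_bytes=None):
    """Return the segments that survive; input must be ordered oldest first."""
    kept = list(segments)
    if max_age_hours is not None:
        # Time-based limit: drop anything older than the retention period.
        kept = [s for s in kept if s.age_hours <= max_age_hours]
    if max_total_bytes is not None:
        # Size-based limit: drop the oldest segments until the log fits.
        while kept and sum(s.size_bytes for s in kept) > max_total_bytes:
            kept.pop(0)
    return kept


log = [Segment(200, 100), Segment(100, 100), Segment(10, 100)]
print(len(apply_retention(log, max_age_hours=168)))    # 2
print(len(apply_retention(log, max_total_bytes=150)))  # 1
```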
Storage management gets complicated with Kafka as retention requirements increase or as the Kafka
cluster itself grows. Centralizing storage with Dell EMC Isilon can greatly simplify storage
management as more space is needed. With Isilon you simply add more Isilon nodes to
the backend, and capacity is instantly available with no need to change any of the Kafka configuration,
except perhaps to increase the retention policy.
The next section covers Dell EMC Isilon in more detail.
Dell EMC Isilon F800 All-Flash NAS
Dell EMC Isilon F800 all-flash scale-out NAS storage provides up to 250,000 IOPS and 15 GB/s bandwidth
per chassis. With a choice of SSD drive capacities, all-flash storage ranges from 96 TB to 924 TB per
chassis making the Isilon F800 ideal for demanding storage requirements in high volume messaging
systems like Kafka.
In addition to the all-flash, high-performance, scale-out hardware design of the Isilon F800, the embedded
storage operating system (Isilon OneFS) provides a unifying clustered file system with built-in scalable
data protection that simplifies storage management and administration. OneFS is a fully symmetric file
system with no single point of failure — taking advantage of clustering not just to scale performance
and capacity, but also to allow for any-to-any failover and multiple levels of redundancy that go far
beyond the capabilities of RAID.
OneFS allows hardware to be incorporated or removed from the cluster at will and at any time,
abstracting the data and applications away from the hardware. Data is given infinite longevity and the
cost and pain of data migrations and hardware refreshes are eliminated.
Isilon nodes
OneFS works exclusively with the Isilon scale-out NAS nodes, referred to as a “cluster”. A single Isilon
cluster consists of multiple nodes, which are rack-mountable enterprise appliances containing: memory,
CPU, networking, Ethernet or low-latency InfiniBand interconnects, disk controllers and storage media.
As such, each node in the distributed cluster has compute as well as storage capabilities.
With the new generation of Isilon hardware (“Gen6”), a single chassis of 4 nodes in a 4U form factor is
required to create a cluster, which currently scales up to 144 nodes. Previous Isilon hardware platforms
need a minimum of three nodes and 6U of rack space to form a cluster. There are several different
types of nodes, all of which can be incorporated into a single cluster, where different nodes provide
varying ratios of capacity to throughput or Input/Output operations per second (IOPS).
Each node or chassis added to a cluster increases aggregate disk, cache, CPU, and network capacity.
OneFS leverages each of the hardware building blocks, so that the whole becomes greater than the
sum of the parts. The RAM is grouped together into a single coherent cache, allowing I/O on any part of
the cluster to benefit from data cached anywhere. A file system journal ensures that writes are safe
across power failures. Spindles and CPU are combined to increase throughput, capacity and IOPS as
the cluster grows, for access to one file or for multiple files. A cluster’s storage capacity can range from
a minimum of 18 terabytes (TB) to a maximum of greater than 68 petabytes (PB). The maximum
capacity will continue to increase as disk drives and node chassis continue to get denser.
Isilon nodes are broken into several classes, or tiers, according to their functionality:
This paper focuses on the F800 node type for Kafka. A good alternative to the F800 is the H600 node
type if storage capacity requirements are lower.
Network
There are two types of networks associated with a cluster: internal and external.
Back-end (internal) network
All inter-node communication in a cluster is performed across a dedicated back-end network,
comprising either 10 or 40 Gb Ethernet, or low-latency QDR InfiniBand (IB). This back-end network,
which is configured with redundant switches for high availability, acts as the backplane for the cluster.
This enables each node to act as a contributor in the cluster and isolates node-to-node
communication to a private, high-speed, low-latency network. This back-end network utilizes Internet
Protocol (IP) for node-to-node communication.
Front-end (external) network
Clients connect to the cluster using Ethernet connections (1GbE, 10GbE or 40GbE) that are available on
all nodes. Because each node provides its own Ethernet ports, the amount of network bandwidth
available to the cluster scales linearly with performance and capacity. The Isilon cluster supports
standard network communication protocols to a customer network, including NFS, SMB, HTTP, FTP, HDFS,
and OpenStack Swift. Additionally, OneFS provides full integration with both IPv4 and IPv6 environments.
The Kafka Isilon F800 cluster tested in this paper uses NFS v3 as the network communication protocol.
Complete cluster view
The following view shows how hardware, software, and networks combine to form the complete cluster:
File system structure
The OneFS file system is based on the UNIX file system (UFS) and, hence, is a very fast distributed file
system. Each cluster creates a single namespace and file system. This means that the file system is
distributed across all nodes in the cluster and is accessible by clients connecting to any node in the
cluster. There is no partitioning, and no need for volume creation.
Because all information is shared among nodes across the internal network, data can be written to or
read from any node, thus optimizing performance when multiple users or applications are concurrently
reading and writing to the same set of data.
For more details on Isilon and OneFS please see Isilon Technical Overview.
Performance Test Environment
Kafka
The Kafka release tested for this paper is 2.12-1.1.0 (Kafka 1.1.0 built for Scala 2.12). Kafka configuration
(including ZooKeeper) properties are shown in the Appendix.
Compute Nodes
All the compute nodes are identical Dell PowerEdge R730xd servers with 40 cores, 256G RAM, 25 x 1.1 TB
SAS disks (directly mounted JBOD, no RAID), and 10G NIC running CentOS Linux release 7.4.1708 (Core).
Up to 12 x PowerEdge R730xd servers were used for various test scenarios that are described in detail in
the Performance Test Results section of this paper. A total of 5 zookeeper servers are configured in the
test environment and run on the first five Kafka servers – k0, k1, k2, k3, and k4. The remaining Kafka
servers are named k5 – k11.
Isilon F800 (NFS Mounted from each Kafka compute node)
A single Isilon F800 chassis with 60 x 1.6TB SSD drives was available for Kafka-Isilon testing.
The specific Isilon Model tested: Isilon F800-4U-Single-256GB-1x1GE-2x40GE SFP+-24TB SSD
The Isilon OneFS release tested: OneFS v 8.1.0.4. NFS configuration details are listed in the Appendix.
Kafka Clusters
Two Kafka clusters were tested - a DAS cluster with 300 x SAS drives vs a single 4U Isilon F800 cluster with
only 60 drives.
The DAS cluster strictly uses PowerEdge R730xd servers for both compute and storage. A total of 12
PowerEdge servers with 300 SAS disks were available (~ 300 TB capacity) for the DAS Kafka cluster. All
the compute nodes were connected over 10GbE with jumbo frames (MTU 9014) enabled.
The Isilon cluster uses a single 4U Isilon F800 for all Kafka storage with the PowerEdge R730xd servers used
for compute only. Each Kafka server NFS mounts a corresponding kafka-logs directory from Isilon.
Each server has a unique mount point. Details on the NFS setup are shown in the Appendix. Only the OS
drive and an additional 1.1 TB drive were used on each PowerEdge server for the Isilon cluster. The Isilon
F800 was connected to the 10GbE PowerEdge environment over 40GbE ports on the core switch with jumbo
frames (MTU 9000) enabled.
Note: Only half the 40GbE front-end ports available on Isilon were connected due to the lack of
available 40GbE ports on the core switch during testing.
Performance Test Results
Producer Throughput Tests
The producer throughput tests stress the throughput of the producer on each cluster (DAS & Isilon). No
consumers are run during these tests so all messages are persisted but not read. Results below show the
average of three test runs.
Note: The optimum batch size is used for each cluster. It was determined by running various tests on each
cluster and selecting the value that yielded the best performance.
Test 1 - Single producer (1 Broker), no replication, 50M records, 100 Byte record size
Producer Test 1 Setup:
DAS:
kafka-topics.sh --zookeeper k0:2181 --create --topic DASr1 --partitions 8 --replication-factor 1
ISILON:
kafka-topics.sh --zookeeper k0:2181 --create --topic F800r1 --partitions 8 --replication-factor 1
Producer Test 1 Commands:
DAS:
kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic DASr1 --num-records 50000000 --throughput -1 --record-size 100 --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=67108864 batch.size=8196
ISILON:
bin/kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic rep1 --num-records 50000000 --throughput -1 --record-size 100 --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=67108864 batch.size=194196
DAS Producer Throughput Result: 50,000,000 records sent, 1,231,861 records/sec (117 MB/sec)
25ms average latency, 42ms 95th percentile latency
ISILON Producer Throughput Result: 50,000,000 records sent, 1,401,424 records/sec (134 MB/sec)
7ms average latency, 7ms 95th percentile latency
Test 2 - Single producer (5 brokers), 3x asynchronous replication, 50M records, 100 Byte record size
Test 2 is exactly the same as the previous one except that now each partition has three replicas (so the
total data written to the cluster is three times greater). Each server is doing both writes from the producer for
the partitions for which it is the leader, as well as fetching and writing data for the partitions for which it is
a follower.
Replication in this test is asynchronous. That is, the server acknowledges the write as soon as it has
written it to its local log without waiting for the other replicas to also acknowledge it. This means that if the
leader were to crash, it would likely lose the last few messages that had been written but not yet
replicated. This makes the message acknowledgement latency a little better at the cost of some risk in
the case of server failure.
When using a JBOD configuration on DAS, replication is important to increase redundancy. However, the
total cluster write capacity is reduced to one third with 3x replication (since each write is done three times). Isilon uses
erasure coding to increase redundancy and improve storage efficiency (near 80% efficiency). Also, the
Isilon high-speed interconnect provides data accessibility across all nodes. Even when there is a node
failure on Isilon, data can be retrieved from the remaining nodes without any re-elections on Kafka.
Producer Test 2 Setup:
DAS:
kafka-topics.sh --zookeeper k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --create --topic DASr3 --partitions 8 --replication-factor 3
ISILON:
kafka-topics.sh --zookeeper k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --create --topic F800r3 --partitions 8 --replication-factor 3
Producer Test 2 Commands:
DAS:
kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic DASr3 --num-records 50000000 --throughput -1 --record-size 100 --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=67108864 batch.size=8196
ISILON:
kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic F800r3 --num-records 50000000 --throughput -1 --record-size 100 --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=67108864 batch.size=194196
DAS Producer Throughput Result: 50,000,000 records sent, 1,217,777 records/sec (116 MB/sec)
40ms average latency, 62ms 95th percentile latency
ISILON Producer Throughput Result: 50,000,000 records sent, 1,224,380 records/sec (117 MB/sec)
17ms average latency, 27ms 95th percentile latency
Asynchronous replication (acks=1) decreases throughput, increases latency, and uses more
storage space. When using Isilon, 3x data replication is not needed because of the redundancy and
efficiency built into Isilon/OneFS, but it is recommended for DAS Kafka cluster deployments.
For Isilon, a replication factor of 1 is fine and will offer the best throughput and storage efficiency. Use a
replication factor of 2 to provide better Kafka server fault tolerance if Kafka compute node failure is a concern
when deploying Kafka with Isilon.
Test 3 - Single producer (5 brokers), 3x synchronous replication, 50M records, 100 Byte record size
Test 3 is the same as Test 2 except that now the leader for a partition waits for acknowledgement from
the full set of in-sync replicas before acknowledging back to the producer. With synchronous
replication, Kafka ensures that messages will not be lost as long as one in-sync replica remains.
Synchronous replication (acks=-1) in Kafka is not fundamentally very different from asynchronous
replication. The leader for a partition always tracks the progress of the follower replicas, and Kafka will not
send out messages to consumers until they are fully acknowledged by replicas. With synchronous
replication, Kafka waits to respond to the producer request until the followers have replicated it.
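The difference between the two acknowledgement modes can be sketched with a toy latency model. This is a deliberate simplification rather than how the broker is implemented: all times are hypothetical, measured from the moment the record reaches the leader, and the function name is an assumption made for illustration.

```python
def ack_latency_ms(leader_write_ms, follower_replicate_ms, acks):
    """Time until the producer is acknowledged in this simplified model.

    acks=1  : the leader acknowledges after its own local write.
    acks=-1 : the leader waits until every in-sync follower has the record.
    """
    if acks == 1:
        return leader_write_ms
    if acks == -1:
        return max([leader_write_ms, *follower_replicate_ms])
    raise ValueError("this sketch only models acks=1 and acks=-1")


# Hypothetical timings: leader writes in 2 ms, followers catch up in 5 and 9 ms.
print(ack_latency_ms(2, [5, 9], acks=1))   # 2
print(ack_latency_ms(2, [5, 9], acks=-1))  # 9
```

In this model the synchronous acknowledgement is gated by the slowest in-sync follower, which is one way to read the larger latency gap between the two modes on the DAS cluster.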
Producer Test 3 Commands:
DAS:
kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic DASr3 --num-records 50000000 --throughput -1 --record-size 100 --producer-props acks=-1 bootstrap.servers=k0:9092 buffer.memory=67108864 batch.size=8196
ISILON:
kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic F800r3 --num-records 50000000 --throughput -1 --record-size 100 --producer-props acks=-1 bootstrap.servers=k0:9092 buffer.memory=67108864 batch.size=194196
DAS Producer Throughput Result: 50,000,000 records sent, 269,879 records/sec (26 MB/sec)
2096ms average latency, 6850ms 95th percentile latency
ISILON Producer Throughput Result: 50,000,000 records sent, 1,046,703 records/sec (100 MB/sec)
33ms average latency, 43ms 95th percentile latency
Synchronous replication (acks=-1) decreases throughput significantly and also introduces
significant latency on the DAS cluster. In all three single-producer test cases, the throughput and
latency results were better with the Isilon NAS cluster even though Isilon had fewer disks.
Test 4 - 5 producers, no replication, 250M records, 100 Byte record size
Test 4 is the same as Test 1 except that the number of producers is increased to 5 and the
generated record load to 250M records for each of the DAS and Isilon NAS Kafka clusters.
Producer Test 4 Setup:
DAS:
kafka-topics.sh --zookeeper k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --create --topic 5DASr1 --partitions 8 --replication-factor 1
ISILON:
kafka-topics.sh --zookeeper k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --create --topic 5F800r1 --partitions 8 --replication-factor 1
Producer Test 4 Commands:
Note: The commands are run simultaneously on each of the 5 Kafka producer servers (k0-k4). The only
change to each command is the bootstrap.servers parameter: each command references the local
server, i.e. server k0 uses bootstrap.servers=k0:9092, k1 uses k1, etc. This has nothing to do with
where partitions are written to; it just tells Kafka where to pull the bootstrap information from.
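The per-server variation described in the note can be sketched by generating one command per host. This assumes the five producer hosts are named k0 through k4, and the helper name `producer_commands` is hypothetical. The command template simply mirrors the DAS producer command used in this test.

```python
# Template mirrors the DAS producer command for this test; only the
# bootstrap server changes from host to host.
BASE = (
    "kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance "
    "--topic {topic} --num-records 50000000 --throughput -1 --record-size 100 "
    "--producer-props acks=1 bootstrap.servers={host}:9092 "
    "buffer.memory=67108864 batch.size=8196"
)


def producer_commands(topic, hosts):
    """Return the command each host should run, keyed by host name."""
    return {host: BASE.format(topic=topic, host=host) for host in hosts}


hosts = [f"k{i}" for i in range(5)]  # assumption: producers run on k0..k4
commands = producer_commands("5DASr1", hosts)
for host, command in commands.items():
    print(f"[{host}] {command}")
```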
DAS:
kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic 5DASr1 --num-records 50000000 --throughput -1 --record-size 100 --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=67108864 batch.size=8196
ISILON:
kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic 5F800r1 --num-records 50000000 --throughput -1 --record-size 100 --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=55108864 batch.size=194196
DAS Producer Throughput Result: 250,000,000 records sent, 4,428,438 records/sec (422 MB/sec)
592ms average latency, 2502ms 95th percentile latency
ISILON Producer Throughput Result: 250,000,000 records sent, 5,045,512 records/sec (481 MB/sec)
105ms average latency, 400ms 95th percentile latency
Note: The records/sec and MB/sec results shown represent the aggregate sum across the 5 producers.
The latency results shown are from the slowest producer.
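The aggregation described in the note can be sketched as follows. The per-producer numbers below are hypothetical placeholders, not measured results, and treating the "slowest" producer as the one with the highest average latency is an assumption.

```python
def aggregate(results):
    """Sum throughput across producers; report latency of the slowest one.

    Assumption: 'slowest' means the producer with the highest average latency.
    """
    slowest = max(results, key=lambda r: r["avg_latency_ms"])
    return {
        "records_per_sec": sum(r["records_per_sec"] for r in results),
        "avg_latency_ms": slowest["avg_latency_ms"],
        "p95_latency_ms": slowest["p95_latency_ms"],
    }


# Hypothetical per-producer results, for illustration only.
sample = [
    {"records_per_sec": 1_000_000, "avg_latency_ms": 80, "p95_latency_ms": 300},
    {"records_per_sec": 900_000, "avg_latency_ms": 105, "p95_latency_ms": 400},
]
summary = aggregate(sample)
print(summary)
```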
Test 5 - 5 producers, 3x asynchronous replication, 250M records, 100 Byte record size
Test 5 is the same as Test 4 except that 3x asynchronous data replication is now configured. This
configuration is not needed (and definitely not recommended) for Isilon; it only makes sense
when using a Kafka DAS cluster. Isilon provides much better data redundancy than DAS; however, Isilon
does not provide protection against a Kafka broker failure. Use 2x asynchronous replication to protect
against Kafka broker failures when using Kafka with Isilon; otherwise stick to a replication factor of 1
to ensure optimum performance and storage efficiency.
Producer Test 5 Setup:
DAS:
kafka-topics.sh --zookeeper k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --create --topic 5DASr3 --partitions 8 --replication-factor 3
ISILON:
kafka-topics.sh --zookeeper k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --create --topic 5F800r3 --partitions 8 --replication-factor 3
Producer Test 5 Commands:
DAS:
kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic 5DASr3 --num-records 50000000 --throughput -1 --record-size 100 --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=67108864 batch.size=8196
ISILON:
kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic 5F800r5 --num-records 50000000 --throughput -1 --record-size 100 --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=55108864 batch.size=194196
DAS Producer Throughput Result: 250,000,000 records sent, 5,729,978 records/sec (546 MB/sec)
98ms average latency, 590ms 95th percentile latency
ISILON Producer Throughput Result: 250,000,000 records sent, 3,838,539 records/sec (367 MB/sec)
186ms average latency, 1,081ms 95th percentile latency
As expected, 3x asynchronous replication increases latency on both clusters. A 3x replication factor
configuration with Isilon will put both the original data and replicated data all on Isilon since each Kafka
server is NFS mounting the associated kafka-logs directory from Isilon (see NFS configuration in Appendix
for details). This configuration unnecessarily adds a lot of network traffic and I/O requests to the Isilon
cluster, which should be avoided. In contrast, the DAS cluster only replicates data to a select number
of servers. Isilon does not need this amount of replication to provide data redundancy. Stick to a
replication factor of 1, or at most a replication factor of 2, when using Isilon NAS with Kafka.
Test 6 – Effect of Record Size on Producer Throughput
So far all producer tests have used a small record size of 100 Bytes. This is not representative of most
production deployments of Kafka, but it is a good size to use for stress testing purposes.
To get a better idea of individual producer throughput performance, different record sizes from 10 Bytes
to 100,000 Bytes were tested to see the effect on producer throughput.
Note: Producer throughput can be measured in two ways – number of records processed per second
or the byte throughput per second (MB/sec). Both results are shown below for reference.
Impact of Record Size on Throughput (Records/sec)
Record Size (Bytes)    DAS (records/sec)    F800 (records/sec)
10                     1,516,203            1,585,389
100                    1,281,251            1,294,378
1,000                  217,367              180,323
10,000                 30,166               32,912
100,000                3,272                3,836
The graphs in this section show that the number of records Kafka can send per second decreases as the
records get larger in size. If we look at MB/second results, the total byte throughput increases as
messages get bigger. It is important to understand the typical record size in your environment so you
can size your Kafka cluster accordingly to meet your throughput requirements.
The results shown are specific to using PowerEdge R730XD servers with the stated specifications;
different compute specifications will have different throughput results. Test your specific compute
model to get a good understanding of the Kafka throughput capabilities for your particular compute
nodes.
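The two throughput measures are related through the record size. The sketch below takes the records/sec data labels from the first chart in this section and converts them to MB/sec. It assumes that MB here means 2^20 bytes, which is an inference: under that interpretation the converted values reproduce the data labels of the MB/sec chart. The function name is hypothetical.

```python
RECORD_SIZES = [10, 100, 1000, 10000, 100000]  # bytes

# Data labels read from the records/sec chart in this section.
DAS_RECORDS_PER_SEC = [1516203, 1281251, 217367, 30166, 3272]
F800_RECORDS_PER_SEC = [1585389, 1294378, 180323, 32912, 3836]


def mb_per_sec(records_per_sec, record_size_bytes):
    """Byte throughput, assuming 1 MB = 2**20 bytes."""
    return records_per_sec * record_size_bytes / 2**20


das_mb = [round(mb_per_sec(r, s)) for r, s in zip(DAS_RECORDS_PER_SEC, RECORD_SIZES)]
f800_mb = [round(mb_per_sec(r, s)) for r, s in zip(F800_RECORDS_PER_SEC, RECORD_SIZES)]

print(das_mb)   # [14, 122, 207, 288, 312]
print(f800_mb)  # [15, 123, 172, 314, 366]
```

Records per second falls by more than two orders of magnitude across this range while byte throughput rises, which is why sizing should start from the typical record size.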
Impact of Record Size on Throughput (MB/sec)
Record Size (Bytes)    DAS (MB/sec)    F800 (MB/sec)
10                     14              15
100                    122             123
1,000                  207             172
10,000                 288             314
100,000                312             366
Consumer Throughput Tests
So far only producer testing has been conducted. Producer tests provide insight into the write
capabilities of Kafka. As shown in the Kafka producer tests above, both the DAS and Isilon clusters can
write over 5M records a second with only 5 producers.
A key value proposition to note so far is that Isilon is providing very good performance with fewer spindles,
less power, and a smaller storage footprint. The 5 producers tested ran on 2U PowerEdge R730xd
servers with 25 drives each, thus the 5-node producer DAS Kafka cluster tested has a total of 125 disks
and takes 10U of rack space. Adhering to Kafka best practices for DAS clusters and using a replication
factor of 3, the approximate usable storage for the DAS cluster is only 42 TB (125 TB / 3).
On the other hand, Isilon offers similar performance with half the number of disk drives, uses only 4U
of rack space, and provides over 80 TB of usable storage! This is great news for Kafka administrators and
organizations looking to reduce the footprint of their Kafka clusters without sacrificing performance.
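The capacity comparison above can be worked through in a few lines. This sketch uses only the figures quoted in the text: 125 TB raw for the DAS producers (the text's approximation of 125 drives at about 1 TB each), a replication factor of 3, and "over 80 TB" usable for Isilon, taken here as a lower bound of 80 TB. The variable and function names are hypothetical.

```python
def usable_tb_with_replication(raw_tb, replication_factor):
    """Usable capacity when every record is stored replication_factor times."""
    return raw_tb / replication_factor


das_raw_tb = 125        # figure quoted in the text for the 5-node DAS cluster
das_rack_units = 10     # five 2U servers
das_usable_tb = usable_tb_with_replication(das_raw_tb, 3)

isilon_usable_tb = 80   # lower bound: the text reports "over 80 TB"
isilon_rack_units = 4

print(f"DAS usable: {das_usable_tb:.1f} TB, "
      f"{das_usable_tb / das_rack_units:.1f} TB per rack unit")
print(f"Isilon usable: at least {isilon_usable_tb} TB, "
      f"{isilon_usable_tb / isilon_rack_units:.1f} TB per rack unit")
```

On these figures the Isilon chassis provides roughly twice the usable capacity in less than half the rack space.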
Now let’s look at the read capabilities of Kafka by running various consumer throughput tests against
both DAS and Isilon NAS Kafka clusters. Note that the replication factor will not affect the outcome of
the consumer tests as the consumer only reads from one replica regardless of the replication factor.
Likewise, the acknowledgement level of the producer also doesn't matter as the consumer only ever
reads fully acknowledged messages.
Test 7 - Single consumer, 50M records, 100 Byte record size
Consumer Test 7 commands:
DAS:
kafka-consumer-perf-test.sh --zookeeper k0:2181 --messages 50000000 --topic DASr3 --threads 1
ISILON:
kafka-consumer-perf-test.sh --zookeeper k0:2181 --messages 50000000 --topic F800r3 --threads 1
DAS Consumer Throughput Result: 50,000,000 records read, 1,763,544 records/sec (168 MB/sec)
ISILON Consumer Throughput Result: 50,000,000 records read, 1,795,139 records/sec (171 MB/sec)
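The MB/sec figures in these results follow from the records/sec figures and the record size. The sketch below shows the conversion, assuming that "MB" here means 2^20 bytes (an assumption that reproduces the Test 7 numbers).

```python
# Convert a records/sec rate into MB/sec for a fixed record size.
# Assumption: 1 MB = 1024 * 1024 bytes, which matches the Test 7 figures.
BYTES_PER_MB = 1024 * 1024


def mb_per_sec(records_per_sec: float, record_size_bytes: int) -> float:
    """Return payload throughput in MB/sec."""
    return records_per_sec * record_size_bytes / BYTES_PER_MB


das = mb_per_sec(1_763_544, 100)      # DAS result from Test 7
isilon = mb_per_sec(1_795_139, 100)   # Isilon result from Test 7
print(f"DAS: {das:.0f} MB/sec, Isilon: {isilon:.0f} MB/sec")
```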
Test 8 - 5 consumers, 250M records, 100 Byte record size
Consumer Test 8 commands: (Executed on 5 Kafka consumers simultaneously)
DAS:
kafka-consumer-perf-test.sh --zookeeper k0:2181 --messages 50000000 --topic 5DASr3 --threads 1
ISILON:
kafka-consumer-perf-test.sh --zookeeper k0:2181 --messages 50000000 --topic 5F800r3 --threads 1
DAS Consumer Throughput Result: 50,000,000 records read, 8,428,965 records/sec (803 MB/sec)
ISILON Consumer Throughput Result: 50,000,000 records read, 8,743,123 records/sec (834 MB/sec)
Test 9 – 1 producer & 1 consumer, 50M records written, 50M records read
The performance tests conducted thus far covered just the Kafka producers and the Kafka consumers
running in isolation. A typical deployment of Kafka runs the producer and consumer together.
Technically, the performance tests above have been running both the producer and consumers
together as Kafka replication works by using the servers themselves as consumers.
For this test one producer and one consumer are run against an eight-partition, 3x-replicated topic that
begins empty. The producer is using async replication. The throughput reported is the consumer
throughput.
Test 9 setup commands:
DAS:
kafka-topics.sh --zookeeper=k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --create --partitions 8 --replication-factor 3 --topic r3DASnew
ISILON:
kafka-topics.sh --zookeeper=k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --create --partitions 8 --replication-factor 3 --topic r3F800new
Test 9 test commands:
DAS Producer:
kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic r3DASnew --num-records 50000000 --throughput -1 --record-size 100 --producer-props acks=1 bootstrap.servers=k1:9092 buffer.memory=67108864 batch.size=8196
DAS Consumer:
kafka-consumer-perf-test.sh --zookeeper k0:2181 --messages 50000000 --topic r3DASnew --threads 1
ISILON Producer:
kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic r3F800new --num-records 50000000 --throughput -1 --record-size 100 --producer-props acks=1 bootstrap.servers=k1:9092 buffer.memory=67108864 batch.size=194196
ISILON Consumer:
kafka-consumer-perf-test.sh --zookeeper k0:2181 --messages 50000000 --topic r3F800new --threads 1
DAS Consumer Throughput Result: 50,000,000 records read, 1,623,640 records/sec (155 MB/sec)
ISILON Consumer Throughput Result: 50,000,000 records read, 1,745,932 records/sec (166 MB/sec)
Test 10 – Stress testing Isilon F800 All-Flash Scale-out NAS
A single Isilon F800, as tested in this paper, only comes with 4 nodes. Isilon normally recommends having
one Isilon node for each compute node in a high performance distributed cluster. The previous results
already show that the Isilon F800 can easily handle a 5 node Kafka cluster.
This test will increase the compute count to 12 nodes to see how well a 4-node F800 Isilon cluster can
support 12 Kafka servers simultaneously generating 50M records then consuming 50M records from the
same Kafka Topic. This equates to generating and consuming 600M records in total with just a single
Isilon F800 chassis that has 4 nodes and 60 drives. Furthermore, three different record sizes will be tested:
10 Bytes, 100 Bytes, and 512 Bytes.
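The size of this workload can be worked out with a short sketch. The payload figures count record bytes only and ignore any per-record overhead that Kafka adds, which is a simplifying assumption; decimal gigabytes are used for readability.

```python
# Workload sizing for the 12-server stress test described above.
KAFKA_SERVERS = 12
RECORDS_PER_SERVER = 50_000_000
ISILON_NODES = 4
RECORD_SIZES = (10, 100, 512)   # bytes, as listed in the text

total_records = KAFKA_SERVERS * RECORDS_PER_SERVER
servers_per_node = KAFKA_SERVERS / ISILON_NODES

print(f"Total records per run: {total_records:,}")
print(f"Kafka servers per Isilon node: {servers_per_node:.0f}")

for size in RECORD_SIZES:
    # Payload only; assumes no per-record overhead (simplification).
    payload_gb = total_records * size / 1e9
    print(f"{size:>3} Byte records: {payload_gb:.1f} GB of payload")
```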
Test 10 Setup:
kafka-topics.sh --create --zookeeper=k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --topic 12nodes10 --partitions 8 --replication-factor 1
kafka-topics.sh --create --zookeeper=k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --topic 12nodes100 --partitions 8 --replication-factor 1
kafka-topics.sh --create --zookeeper=k0:2181,k1:2181,k2:2181,k3:2181,k4:2181 --topic 12nodes512 --partitions 8 --replication-factor 1
Test 10 commands: (Executed on 12 Kafka server nodes simultaneously)
Producer commands:
10 Byte Test:
kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic 12nodes10 --num-records 50000000 --throughput -1 --record-size 10 --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=55108864 batch.size=194196
100 Byte Test:
kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic 12nodes100 --num-records 50000000 --throughput -1 --record-size 100 --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=55108864 batch.size=194196
512 Byte Test:
kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic 12nodes512 --num-records 50000000 --throughput -1 --record-size 512 --producer-props acks=1 bootstrap.servers=k0:9092 buffer.memory=55108864 batch.size=194196
Consumer commands:
10 Byte Test:
kafka-consumer-perf-test.sh --zookeeper k0:2181 --messages 50000000 --topic 12nodes10 --threads 1
100 Byte Test:
kafka-consumer-perf-test.sh --zookeeper k0:2181 --messages 50000000 --topic 12nodes100 --threads 1
512 Byte Test:
kafka-consumer-perf-test.sh --zookeeper k0:2181 --messages 50000000 --topic 12nodes512 --threads 1
Test 10 Results:
Producer results:
ISILON 10B Throughput Result: 600,000,000 records, 17,719,448 records/sec (168 MB/sec)
ISILON 100B Throughput Result: 600,000,000 records, 12,161,606 records/sec (1,158 MB/sec)
ISILON 512B Throughput Result: 600,000,000 records, 4,335,062 records/sec (2,115 MB/sec)
Consumer results:
ISILON 10B Throughput Result: 600,000,000 records read, 19,141,011 records/sec (1,825 MB/sec)
ISILON 100B Throughput Result: 600,000,000 records read, 20,030,055 records/sec (1,911 MB/sec)
ISILON 512B Throughput Result: 600,000,000 records read, 10,607,182 records/sec (5,179 MB/sec)
A single Isilon F800 performed very well under high load. The Kafka results show that producers can write
more than 17 million records a second and consumers can read more than 20 million records a second with a
single F800 Isilon chassis.
All the performance results shown thus far come directly from Kafka. We can also get stats directly from
Isilon while under load to see the utilization of the individual Isilon nodes, network throughput rates, disk
throughput rates, CPU utilization, the number of active clients, etc.
Below are Isilon-specific performance reporting charts during the 10 Byte, 100 Byte, and 512 Byte tests
with 12 Kafka servers.
Isilon Performance Report during 10 Byte Record Size Test
Isilon Performance Report during 100 Byte Record Size
Isilon Performance Report during 512 Byte Record Size
You can see from the Isilon performance reports above that all four nodes show even load distribution
for network and disk throughput rates. Load balancing of network connections and disk I/O is a key value
proposition that Isilon provides, and it is automatic. All the Kafka nodes simply NFS mount Isilon by its
FQDN (fully qualified domain name), and OneFS transparently handles all the load balancing for you.
Also note that the CPU utilization stayed low during all test runs. This means the F800 had resources to
spare while the stress test was running, which is a testament to the engineering work that went into the
product.
Conclusions
This paper shows that a single Isilon F800 performed very well under high load with Kafka. The Kafka
results show that producers can write more than 17 million records a second and consumers can read more than
20 million records a second using a single Isilon F800 NAS system. During all performance tests, the
CPU utilization across the entire Isilon cluster stayed very low.
The Isilon F800 Scale-out NAS storage system performed just as well as a Kafka DAS (direct attached
storage) cluster that had about twice the number of disks (125 versus 60). The embedded erasure code design
with Isilon OneFS also provides much better storage efficiency and data protection than Kafka DAS clusters
that use a 3x replication factor. This allows Kafka administrators to safely lower the Kafka replication
factor and increase storage capacity and efficiency with ease when using Isilon.
Appendix
Kafka Server Properties
Below is the running server properties configuration file used for this paper. The only difference between
the Kafka DAS configuration and the Kafka Isilon configuration is the log.dirs setting shown below; everything
else is the same on both clusters.
broker.id=0
listeners=PLAINTEXT://k0:9092
num.network.threads=24
num.io.threads=8
socket.send.buffer.bytes=13107200
socket.receive.buffer.bytes=13107200
socket.request.max.bytes=104857600
# Isilon configuration: only a single Isilon NFS mount is needed
log.dirs=/mnt/k0/kafka-logs
# DAS configuration: all 24 direct-attached data drives are used; the remaining drive is for the OS
log.dirs=/data1/kafka/kafka-logs,/data2/kafka/kafka-logs,/data3/kafka/kafka-logs,/data4/kafka/kafka-logs,/data5/kafka/kafka-logs,/data6/kafka/kafka-logs,/data7/kafka/kafka-logs,/data8/kafka/kafka-logs,/data9/kafka/kafka-logs,/data10/kafka/kafka-logs,/data11/kafka/kafka-logs,/data12/kafka/kafka-logs,/data13/kafka/kafka-logs,/data14/kafka/kafka-logs,/data15/kafka/kafka-logs,/data16/kafka/kafka-logs,/data17/kafka/kafka-logs,/data18/kafka/kafka-logs,/data19/kafka/kafka-logs,/data20/kafka/kafka-logs,/data21/kafka/kafka-logs,/data22/kafka/kafka-logs,/data23/kafka/kafka-logs,/data24/kafka/kafka-logs
num.partitions=8
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=k0:2181,k1:2181,k2:2181,k3:2181,k4:2181
zookeeper.connection.timeout.ms=6000
group.initial.rebalance.delay.ms=0
delete.topic.enable=true
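The long DAS log.dirs value is easy to mistype by hand. A minimal sketch of generating it is shown below, assuming the data drives are mounted as /data1 through /data24 (the function name das_log_dirs is hypothetical and used only for illustration).

```python
# Build the DAS log.dirs value instead of typing 24 paths by hand.
# Assumption: data drives are mounted as /data1 .. /data24.
DATA_DRIVES = 24


def das_log_dirs(drive_count: int) -> str:
    """Return a comma-separated list of per-drive Kafka log directories."""
    return ",".join(
        f"/data{i}/kafka/kafka-logs" for i in range(1, drive_count + 1)
    )


log_dirs = das_log_dirs(DATA_DRIVES)
print(f"log.dirs={log_dirs}")
```

With Isilon, the equivalent setting is the single path /mnt/k0/kafka-logs, so no generation step is needed.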
Zookeeper Properties
Below is the zookeeper properties file used for both Kafka DAS and Kafka Isilon clusters.
clientPort=2181
maxClientCnxns=0
server.0=k0:2888:3888
server.1=k1:2888:3888
server.2=k2:2888:3888
server.3=k3:2888:3888
server.4=k4:2888:3888
initLimit=5
syncLimit=2
Producer Properties
bootstrap.servers=k0:9092,k1:9092,k2:9092,k3:9092,k4:9092
compression.type=lz4
Consumer Properties
bootstrap.servers=k0:9092,k1:9092,k2:9092,k3:9092,k4:9092
group.id=test-consumer-group
NFS Client Configuration
The Kafka Isilon cluster uses NFS to mount the remote Isilon OneFS file system. Using NFS centralizes all
the Kafka data on Isilon, which gives Kafka administrators an easy way to increase storage
capacity on the fly by simply adding more Isilon nodes. There is no need to add servers or modify any of the
Kafka configuration: as soon as new Isilon nodes are added to the Isilon cluster, the Kafka cluster
immediately sees an increase in storage capacity and performance.
Since NFS is being used as the protocol to connect the Kafka servers to Isilon, it’s important to optimize the
NFS client settings to obtain the performance detailed in this paper. Below are the NFS mount options
used in /etc/fstab on each Kafka server for the Kafka Isilon cluster.
Note: The mount options are the same on each Kafka server; the only difference on each server is the
mount point itself. The example below is the configuration for Kafka server k0 only, which mounts /ifs/k0
from Isilon. Kafka server k1 mounts /ifs/k1 from Isilon, and so on. In this setup, each Kafka server has its
own unique NFS export from Isilon/OneFS. The Isilon configuration is described in detail in the next section
for reference. A single export on Isilon could have been used as well; in that case, each Kafka server would
just mount a different subdirectory.
Example Kafka Server K0 NFS Client Configuration:
/etc/fstab
isilon.example.com:/ifs/k0 /mnt/k0 nfs nolock,noacl,nocto,noatime,async,nodiratime,nfsvers=3,tcp,rw,hard,intr,timeo=600,retrans=2,rsize=524288,wsize=524288 0 0
The isilon.example.com entry in /etc/fstab above corresponds to the OneFS SmartConnect Zone Name.
SmartConnect is what provides load balancing via DNS, so you must delegate this zone name to Isilon
on your DNS server to ensure a proper load balancing configuration for Kafka.
See the SmartConnect Whitepaper for further information.
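Because only the export and mount point differ between servers, the /etc/fstab entries can be generated rather than edited by hand. The sketch below assumes the k0 to k11 naming and the isilon.example.com zone name used in this paper; the fstab_line helper is hypothetical. Note that the options field in /etc/fstab must not contain spaces.

```python
# Generate one /etc/fstab entry per Kafka server.
# Assumptions: servers are named k0..k11, each with export /ifs/<name>
# and mount point /mnt/<name>; fstab_line is an illustrative helper.
MOUNT_OPTIONS = ",".join([
    "nolock", "noacl", "nocto", "noatime", "async", "nodiratime",
    "nfsvers=3", "tcp", "rw", "hard", "intr",
    "timeo=600", "retrans=2", "rsize=524288", "wsize=524288",
])


def fstab_line(server: str, zone: str = "isilon.example.com") -> str:
    """Return the fstab entry for one Kafka server."""
    return f"{zone}:/ifs/{server} /mnt/{server} nfs {MOUNT_OPTIONS} 0 0"


for n in range(12):
    print(fstab_line(f"k{n}"))
```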
The NFS mount options above are described below for reference. These
settings increase NFS performance.
async mode allows Isilon to reply to the NFS client as soon as it has processed the I/O request and sent
it to the local filesystem.
nfsvers=3 specifies that NFS version 3 is used.
noacl disables Access Control List (ACL) processing.
noatime option specifies that inode access times are not updated on the filesystem.
nocto option suppresses the retrieval of new attributes when creating a file.
nodiratime option specifies that the directory inode is not updated on the filesystem when it is accessed.
nolock option prevents the exchange of file lock information between the NFS server and this NFS client.
retrans specifies the number of times the NFS client retries a request.
rsize and wsize options specify the maximum number of bytes per NFS read and write request.
rw option mounts the remote NFS file system in read/write mode.
tcp option specifies NFS over TCP instead of UDP.
timeo option is the amount of time, in tenths of a second, that the NFS client waits on the NFS server
before retransmitting a request.
Isilon Configuration
The following features were configured on the Isilon cluster. The Smart features shown below are
product differentiators that significantly enhance data storage performance and resiliency.
Enable SmartPools settings across all Isilon nodes
Enable SmartConnect to provide automatic client connection load balancing and failover capabilities
Enable SmartCache for write performance and Streaming Access for Data Access Optimization
Use optimization for streaming data access pattern
Use 40 Gb/s external network ports for NFS connections and internal 40 Gb/s ports for the data interconnect network
Increase network MTU to 9000 (Jumbo Frames) for both internal and external networks
The SmartCache and Streaming Access optimizations are easily enabled in the Isilon OneFS GUI through
a File Pool Policy tab. The Kafka Isilon cluster screen-shot is shown below for reference.
The Storage Pool is created on the SmartPools tab, which allows you to specify the Isilon nodes and
protection settings. Below is a screen-shot of the storage pool configured for the Kafka Isilon cluster.
The internal and external network configuration is set in the network configuration tab, where you specify
MTU size, IP info, and DNS server info.
The SmartConnect info is configured within the pool properties. In the Kafka Isilon cluster, the pool name
is pool0 and the SmartConnect info (IP and DNS info blacked-out) is shown in the screen-shot below:
Note: Isilon provides 2 x 40 GbE front-end and 2 x 40 GbE back-end ports with each node. The Kafka
Isilon cluster was only configured with one 40 GbE front-end port on each node during performance
testing. With production deployments, use both front-end 40 GbE ports.
Lastly, the NFS exports are configured for each Kafka server, namely /ifs/k0 to /ifs/k11 as shown below:
The export settings for NFSv3 are in the Global Settings tab:
OneFS TCP Tuning
The default TCP stack of OneFS needs some tuning for Kafka and 40GbE connectivity. The tuning needs
to be done within the CLI directly on Isilon. A tcptune.sh script is available on GitHub.
Run sh ./tcptune.sh Max to make the changes. An example script run is shown below:
Before changes:
isilon# sh ./tcptune.sh Max
Tuning TCP stack to Max
TCP sysctls before...
kern.ipc.maxsockbuf=2097152
net.inet.tcp.sendbuf_max=2097152
net.inet.tcp.recvbuf_max=2097152
net.inet.tcp.sendbuf_inc=8192
net.inet.tcp.recvbuf_inc=16384
net.inet.tcp.sendspace=131072
net.inet.tcp.recvspace=131072
efs.bam.coalescer.insert_hwm=209715200
efs.bam.coalescer.insert_lwm=178257920
After changes:
Apply tuning...
Value set successfully
Value set successfully
Value set successfully
Value set successfully
Value set successfully
Value set successfully
Value set successfully
Value set successfully
TCP sysctls after...
kern.ipc.maxsockbuf=104857600
net.inet.tcp.sendbuf_max=52428800
net.inet.tcp.recvbuf_max=52428800
net.inet.tcp.sendbuf_inc=16384
net.inet.tcp.recvbuf_inc=32768
net.inet.tcp.sendspace=26214400
net.inet.tcp.recvspace=26214400
efs.bam.coalescer.insert_hwm=209715200
efs.bam.coalescer.insert_lwm=178257920
net.inet.tcp.mssdflt=8948
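To see at a glance which settings the script run above changed, the before and after listings can be compared. The sketch below copies the values from the output shown; the changed_sysctls helper is hypothetical, and a value of None means the setting was not listed in the "before" output.

```python
# Compare the "before" and "after" sysctl listings shown above.
BEFORE = {
    "kern.ipc.maxsockbuf": 2097152,
    "net.inet.tcp.sendbuf_max": 2097152,
    "net.inet.tcp.recvbuf_max": 2097152,
    "net.inet.tcp.sendbuf_inc": 8192,
    "net.inet.tcp.recvbuf_inc": 16384,
    "net.inet.tcp.sendspace": 131072,
    "net.inet.tcp.recvspace": 131072,
    "efs.bam.coalescer.insert_hwm": 209715200,
    "efs.bam.coalescer.insert_lwm": 178257920,
}
AFTER = {
    "kern.ipc.maxsockbuf": 104857600,
    "net.inet.tcp.sendbuf_max": 52428800,
    "net.inet.tcp.recvbuf_max": 52428800,
    "net.inet.tcp.sendbuf_inc": 16384,
    "net.inet.tcp.recvbuf_inc": 32768,
    "net.inet.tcp.sendspace": 26214400,
    "net.inet.tcp.recvspace": 26214400,
    "efs.bam.coalescer.insert_hwm": 209715200,
    "efs.bam.coalescer.insert_lwm": 178257920,
    "net.inet.tcp.mssdflt": 8948,
}


def changed_sysctls(before: dict, after: dict) -> dict:
    """Return {name: (old, new)} for settings that differ or are new."""
    return {
        name: (before.get(name), new)
        for name, new in after.items()
        if before.get(name) != new
    }


diff = changed_sysctls(BEFORE, AFTER)
for name, (old, new) in diff.items():
    old_text = "not listed" if old is None else str(old)
    print(f"{name}: {old_text} -> {new}")
```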
That’s basically it for the Isilon configuration. With Dell EMC Isilon Scale-out NAS, you can now deploy
your own Kafka cluster and centrally store all your data while supporting millions of Kafka write and read
operations a second.
Kafka End-to-End Latency Test
Latency information is provided in the results of some of the test runs; however, those Kafka results were
not end-to-end latency results.
Kafka provides a latency tool to test end-to-end latency between a Kafka producer and a Kafka
consumer. Below are the results of the end-to-end latency test for a single producer and a single
consumer on both the Kafka DAS and Kafka Isilon clusters. The 99.9th percentile latency is better with the
Isilon F800 (41 ms versus 50 ms). This tool does not provide throughput information. The test uses a
100 byte record size and 5000 records.
DAS:
bin/kafka-run-class.sh kafka.tools.EndToEndLatency k0:9092 DAS-latency 5000 all 100
0 204.818597
1000 2.255263
2000 1.697824
3000 1.760031
4000 1.704499
Avg latency: 2.7261 ms
Percentiles: 50th = 1, 99th = 36, 99.9th = 50
F800:
bin/kafka-run-class.sh kafka.tools.EndToEndLatency k0:9092 F800-latency 5000 all 100
0 158.756064
1000 1.97554
2000 2.1549609999999997
3000 1.612731
4000 1.8153979999999998
Avg latency: 3.7892 ms
Percentiles: 50th = 2, 99th = 36, 99.9th = 41
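The gap between the average and the 50th percentile in both runs is what one would expect when a few slow records (such as the first one, at roughly 150 to 200 ms) are mixed in with many fast ones. The sketch below illustrates this with a simple index-based percentile over sorted samples. The sample latencies are hypothetical, and the percentile helper is an illustration only, not necessarily how the Kafka tool computes its figures.

```python
# Illustrate how one slow record skews the average but not the median.
# The latencies below are hypothetical sample values in milliseconds.
def percentile(sorted_values: list, fraction: float) -> float:
    """Return the value at the given fraction of a sorted list."""
    index = min(int(len(sorted_values) * fraction), len(sorted_values) - 1)
    return sorted_values[index]


latencies_ms = [1.7, 2.3, 1.6, 204.8, 1.8, 2.1, 1.9, 36.0, 1.7, 2.0]
ordered = sorted(latencies_ms)

avg = sum(ordered) / len(ordered)
p50 = percentile(ordered, 0.5)
p99 = percentile(ordered, 0.99)
print(f"avg = {avg:.2f} ms, 50th = {p50} ms, 99th = {p99} ms")
```

For this reason the percentile figures are usually a better basis for comparing the two clusters than the averages.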