Modern Distributed Messaging and RPC
![Page 1: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/1.jpg)
LIGHTWEIGHT MESSAGING AND RPC IN DISTRIBUTED SYSTEMS
Max A. Alexejev, 11.10.2012
![Page 2: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/2.jpg)
Some Theory to start with…
![Page 3: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/3.jpg)
Messaging System
Message (not packet/byte/…) as a minimal transmission unit.
The whole system unifies
• Underlying protocol (TCP, UDP)
• UNICAST or MULTICAST
• Data format (message types & structure)
Tied with
• Serialization format (text or binary)
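To make "message as a minimal transmission unit" concrete, here is a minimal sketch of carving whole messages out of a TCP-style byte stream. The wire format (4-byte length prefix plus a JSON body) and the names `encode_message` and `decode_messages` are assumptions made for illustration, not something the slides prescribe.

```python
import json
import struct
from typing import List, Tuple

# Hypothetical wire format: 4-byte big-endian length prefix, then a JSON body.
HEADER = struct.Struct(">I")


def encode_message(msg_type: str, payload: dict) -> bytes:
    """Serialize one message into a length-prefixed frame."""
    body = json.dumps({"type": msg_type, "payload": payload}).encode("utf-8")
    return HEADER.pack(len(body)) + body


def decode_messages(stream: bytes) -> Tuple[List[dict], bytes]:
    """Split a byte stream into whole messages; return them plus leftover bytes."""
    messages = []
    pos = 0
    while len(stream) - pos >= HEADER.size:
        (length,) = HEADER.unpack_from(stream, pos)
        start = pos + HEADER.size
        if len(stream) - start < length:
            break  # incomplete frame: wait for more bytes to arrive
        body = stream[start:start + length]
        messages.append(json.loads(body.decode("utf-8")))
        pos = start + length
    return messages, stream[pos:]
```

The receiver only ever hands complete messages to the application; a partially received frame stays in the leftover buffer until the rest arrives.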
![Page 4: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/4.jpg)
Typical peer-to-peer messaging
Producer[host, port]
Consumer[host, port]
![Page 5: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/5.jpg)
Typical broker-based messaging
Producer [bhost, bport] → Broker ← Consumer [bhost, bport]
• Broker is an indirection layer between producer and consumer.
• Producer PUSHes messages to broker.
• Consumer PULLs messages from broker.
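The push/pull indirection can be sketched with a toy in-memory broker. The class `InMemoryBroker` and its method names are hypothetical; a real broker would add networking, persistence, and acknowledgements.

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Toy broker: producers push to a named queue, consumers pull from it."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def push(self, queue_name, message):
        """Called by a producer; it never learns who will consume the message."""
        self._queues[queue_name].append(message)

    def pull(self, queue_name):
        """Called by a consumer; returns the oldest message, or None if empty."""
        queue = self._queues[queue_name]
        return queue.popleft() if queue else None
```

Producer and consumer share only the broker's address and a queue name, which is the decoupling the slide describes.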
![Page 6: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/6.jpg)
The trick is…
Producer [bhost, bport] → Broker ← Consumer [bhost, bport]
• Producers and consumers are logical units.
• Both P and C may be launched in multiple instances.
• p2p and pubsub terms are expressed in terms of these logical (!) units.
• Even the broker may be a distributed or replicated entity.
![Page 7: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/7.jpg)
Generic SOA picture
S1
S2
S3
S4
S5
S6
![Page 8: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/8.jpg)
In a generic case
• A service may be both a consumer for many producers and a producer to many consumers
![Page 9: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/9.jpg)
Characteristics and Features
• Topology (1-1, 1-N, N-N)
• Retries
• Service discovery
• Guaranteed delivery (if yes: at-least-once or exactly-once)
• Ordering
• Acknowledge
• Disconnect detection
• Transactions support (can participate in distributed transactions)
• Persistence
• Portability (one or many languages and platforms)
• Distributed or not
• Highly available or not
• Type (p2p or broker-based)
• Load balancing strategy for consumers
• Client backoff strategy for producers
• Tracing support
• Library or standalone software
![Page 10: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/10.jpg)
Main classes
• ESBs (Enterprise service buses)
– Slow, but most feature-rich. MuleESB, JbossESB, Apache Camel, many commercial.
• JMS implementations
– ActiveMQ, JBOSS Messaging, Glassfish, etc.
• AMQP implementations
– RabbitMQ, Qpid, HornetQ, etc.
• Lightweight modern stuff - unstandardized
– ZeroMQ, Finagle, Kafka, Beanstalkd, etc.
![Page 11: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/11.jpg)
Messaging Performance
As usual, it's about throughput and latency…
Major throughput factors:
– Network hardware used
– UNICAST vs MULTICAST (for fan-out)
Major latency factors:
– Persistence (batched or single-message persistence involves sequential or random disk writes)
– Transactions
– Broker replication
– Delivery guarantees (at-least-once & exactly-once)
![Page 12: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/12.jpg)
Guaranteed delivery
Involves additional logic on the Producer, the Consumer, and the Broker (if any)!
This is at-least-once delivery:
• Producer needs to get ack’ed by Broker
• Consumer needs to track high-watermark of messages received from Broker
Exactly-once delivery requires more work and is even more expensive. It is typically implemented as a two-phase commit.
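The at-least-once recipe above can be sketched as follows. The sketch assumes the producer assigns increasing sequence numbers and that the broker stores every message it receives but may lose the ack; `FlakyBroker`, `produce`, and `Consumer` are hypothetical names for illustration only.

```python
class FlakyBroker:
    """Toy broker that stores every message but drops the first ack for each."""

    def __init__(self):
        self.log = []            # stored messages as (sequence number, payload)
        self._seen_once = set()

    def send(self, seq, payload):
        self.log.append((seq, payload))
        if seq not in self._seen_once:
            self._seen_once.add(seq)
            return False         # ack lost: producer cannot tell it was stored
        return True


def produce(broker, seq, payload, max_attempts=5):
    """Resend until the broker acks; retries may leave duplicates in the log."""
    for _ in range(max_attempts):
        if broker.send(seq, payload):
            return True
    return False


class Consumer:
    """Tracks the highest sequence number processed and skips anything older."""

    def __init__(self):
        self.high_watermark = -1
        self.processed = []

    def consume(self, broker):
        for seq, payload in broker.log:
            if seq <= self.high_watermark:
                continue         # duplicate caused by a producer retry
            self.processed.append(payload)
            self.high_watermark = seq
```

Retries guarantee the message arrives at least once; the high-watermark check is what keeps the duplicates from being processed twice on the consumer side.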
![Page 13: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/13.jpg)
Ordering (distributed broker scenario)
• No ordering: consumers receive messages in any order. Very cheap.
• Partitioned ordering: messages are ordered within a single data partition, keyed by e.g. stock symbol or account number. Makes a well-performing distributed broker possible.
• Global (fair) ordering: all incoming messages are fairly ordered. Scalability and performance are limited.
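Partitioned ordering can be sketched as a set of independent logs, with a stable hash of the key choosing the partition. The use of CRC32 and the names `partition_for` and `PartitionedLog` are assumptions for illustration.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable (process-independent) hash."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class PartitionedLog:
    """Each partition is an independent, ordered list of messages."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, message):
        """Append to the key's partition and return the partition index."""
        index = partition_for(key, len(self.partitions))
        self.partitions[index].append((key, message))
        return index
```

All messages for one key land in the same partition in arrival order, while different partitions can live on different brokers and need no coordination with each other.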
![Page 14: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/14.jpg)
Remote procedure calls
Inherently builds on top of some messaging.
Method call as a minimal unit (3 states: may succeed returning optional value, throw exception, or time out).
Adds some RPC-specific characteristics & features:
• Sync or async
• Distributed stack traces for exceptions
• Interfaces and structs declaration (possibly, via some DSL) – often come with serialization library
• May support schema evolution
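The three outcomes of a remote call (value, exception, timeout) can be sketched with a local thread pool standing in for the network. `call_with_timeout` is a hypothetical helper, not part of any RPC library.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as CallTimeout


def call_with_timeout(executor, func, timeout, *args):
    """Run func(*args) and report one of three outcomes as (status, value)."""
    future = executor.submit(func, *args)
    try:
        return ("ok", future.result(timeout=timeout))
    except CallTimeout:
        # The callee may still be running; the caller just stops waiting.
        return ("timeout", None)
    except Exception as exc:
        return ("error", exc)
```

Note that a timeout tells the caller nothing about whether the call was executed, which is why retries and delivery guarantees matter for RPC as much as for messaging.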
![Page 15: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/15.jpg)
Serialization libraries
Currently, there are 4 clear winners:
1. Google Protocol buffers (with ProtoStuff)
2. Apache Thrift
3. Avro
4. MessagePack
All provide DSLs and schema evolution. Difference is in wire format and DSL compiler form (program in C, in Java, or does not require compilation).
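Schema evolution boils down to readers tolerating fields they do not know and filling defaults for fields that are missing. The sketch below illustrates the idea with numeric field tags over JSON; it is not the wire format of any of the four libraries, and `SCHEMA_V1`, `SCHEMA_V2`, `encode`, and `decode` are hypothetical.

```python
import json

# Hypothetical schemas: field tag -> (field name, default value).
SCHEMA_V1 = {1: ("user_id", 0), 2: ("name", "")}
SCHEMA_V2 = {1: ("user_id", 0), 2: ("name", ""), 3: ("email", None)}


def encode(record, schema):
    """Write fields keyed by tag, so field names can change without breaking readers."""
    tagged = {
        str(tag): record.get(name, default)
        for tag, (name, default) in schema.items()
    }
    return json.dumps(tagged).encode("utf-8")


def decode(data, schema):
    """Read only the tags this schema knows; use defaults for missing ones."""
    tagged = json.loads(data.decode("utf-8"))
    return {
        name: tagged.get(str(tag), default)
        for tag, (name, default) in schema.items()
    }
```

A V2 reader can consume V1 data (the new field takes its default), and a V1 reader can consume V2 data (the unknown tag is ignored), so services can be upgraded independently.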
![Page 16: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/16.jpg)
Messaging vs RPC
Messaging
• In Broker-enabled case: Producers are decoupled from Consumers. Just push message and don’t care who pulls it.
• Natively matches messages to events in event-sourcing architectures.
RPC
• Need to know destination (i.e., service A must know service B and call signature).
Messaging and RPC dictate different programming models. RPC requires higher coupling between interacting services.
![Page 17: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/17.jpg)
And Practice to continue!
![Page 18: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/18.jpg)
Today’s Overview
• Broker[less] peer-to-peer messaging
ZeroMQ
• Broker-enabled persistent distributed pubsub
Apache Kafka
• Multi-paradigm and feature-rich RPC in Scala
Twitter Finagle
![Page 19: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/19.jpg)
ZeroMQ
“It's sockets on steroids. It's like mailboxes with routing. It's fast! Things just become simpler. Complexity goes away. It opens the mind. Others try to explain by comparison. It's smaller, simpler, but still looks familiar.”
@ ZeroMQ 2.2 Guide
![Page 20: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/20.jpg)
ZeroMQ - features
• Topology – all, very flexible.
• Retries – no.
• Service discovery – no.
• Guaranteed delivery – no.
• Acknowledge – no.
• Disconnect detection – no.
• Transactions support (can participate in distributed transactions) – no.
• Persistence – kind of.
• Portability (one or many languages and platforms) – yes, there are many bindings. However, the library itself is written in C++ (with a C API), so there’s only one “native” binding.
• Distributed – yes.
• Highly available or not – no.
• Type (p2p or broker-based) – mostly p2p. In case of N-N topology, a broker is needed in the form of a ZMQ “Device” with ROUTER/DEALER type sockets.
• Load balancing strategy for consumers – yes (???).
• Client backoff strategy for producers – no.
• Tracing support – no.
• Library or standalone software – platform-native library + language bindings.
![Page 21: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/21.jpg)
ZeroMQ – features explained
Aren’t there too many “no”s?
Yes and no. Most of the features are not provided out of the box, but may be implemented manually in the client and/or server.
Some features are easy to implement (heartbeats, acks, retries, …), some are very complex (guaranteed delivery, persistence, high availability).
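As an example of an "easy" feature, heartbeat-based disconnect detection needs little more than a table of last-seen timestamps. The sketch below is library-agnostic and deliberately does not use the ZeroMQ API; `HeartbeatMonitor` is a hypothetical class, and the clock is injectable so the logic can be exercised without waiting.

```python
import time


class HeartbeatMonitor:
    """Declares a peer dead when no heartbeat arrived within the timeout."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock
        self._last_seen = {}

    def beat(self, peer_id):
        """Record a heartbeat (or any other message) received from a peer."""
        self._last_seen[peer_id] = self._clock()

    def dead_peers(self):
        """Return the sorted ids of peers that have been silent for too long."""
        now = self._clock()
        return sorted(
            peer for peer, seen in self._last_seen.items()
            if now - seen > self._timeout
        )
```

In a real service, `beat` would be called from the receive loop and `dead_peers` from a periodic timer that triggers reconnects or failover.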
![Page 22: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/22.jpg)
ZeroMQ – what’s bad about it
• First of all, the name. Think of ZMQ as a sockets library and you’re happy. Consider it messaging middleware and you’ll get frustrated just reading the guide.
• Complex implementation for multithreaded clients and servers.
• There were issues with services going down due to corrupted packets (so, may not be suitable for WAN).
• Some mess with the development process. The initial ZMQ developers forked ZMQ as Crossroads.io.
![Page 23: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/23.jpg)
ZeroMQ – what’s good
• Huge list of supported platforms.
• MULTICAST support for fan-out (1-N) topology.
• High raw performance.
• Fluent connect/disconnect/reconnect behavior – really feels how it should be.
• Wants to be part of Linux kernel.
![Page 24: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/24.jpg)
ZeroMQ – verdict
• Good for unreliable, high-performance communication, when delivery semantics are not strict. Example - ngx-zeromq module for NGINX.
• Good if you can invest sufficient effort in building custom messaging platform on top of ZMQ as a network library. Example – ZeroRPC lib by DotCloud.
• Bad for any other purpose.
![Page 25: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/25.jpg)
Apache Kafka
“We have built a novel messaging system for log processing called Kafka that combines the benefits of traditional log aggregators and messaging systems. On the one hand, Kafka is distributed and scalable, and offers high throughput. On the other hand, Kafka provides an API similar to a messaging system and allows applications to consume log events in real time.”
@ Kafka: a Distributed Messaging System for Log Processing, LinkedIn
![Page 26: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/26.jpg)
Kafka - features
• Topology – all.
• Retries – no.
• Service discovery – yes (Zookeeper).
• Guaranteed delivery – no (at-least-once in normal case).
• Acknowledge – no.
• Disconnect detection – yes (Zookeeper).
• Transactions support (can participate in distributed transactions) – no.
• Persistence – yes.
• Portability (one or many languages and platforms) – no.
• Distributed – yes.
• Highly available or not – no (work in progress).
• Type (p2p or broker-based) – broker-enabled with distributed broker.
• Load balancing strategy for consumers – yes.
• Client backoff strategy for producers – yes.
• Tracing support – no.
• Library or standalone software – standalone + client libraries in Java.
![Page 27: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/27.jpg)
Kafka - Architecture
![Page 28: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/28.jpg)
Kafka - Internals
• Fast writes
– Configurable batching
– All writes are sequential, no need for random disk access (i.e., works well on commodity SATA/SAS disks in RAID arrays)
• Fast reads
– O(1) disk search
– Extensive use of sendfile()
– No in-memory data caching inside Kafka – fully relies on the OS file system’s page cache
• Elastic horizontal scalability
– Zookeeper is used for broker and consumer discovery
– Pubsub topics are distributed among brokers
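The write and read path above can be approximated with an append-only log in which a message is addressed by its byte offset. This is a sketch under assumptions: an in-memory `bytearray` stands in for the on-disk segment file, offsets are taken to be byte positions, and `AppendOnlyLog` is a hypothetical class rather than Kafka's actual storage code.

```python
import struct

LENGTH = struct.Struct(">I")  # 4-byte big-endian length prefix per message


class AppendOnlyLog:
    """Sequential-write log where an offset is a byte position in the log."""

    def __init__(self):
        self._data = bytearray()

    def append_batch(self, messages):
        """Append a whole batch in one write; return each message's offset."""
        offsets = []
        chunk = bytearray()
        for message in messages:
            offsets.append(len(self._data) + len(chunk))
            chunk += LENGTH.pack(len(message)) + message
        self._data += chunk
        return offsets

    def read(self, offset):
        """Return (message, next_offset), or None when offset is at the end."""
        if offset >= len(self._data):
            return None
        (length,) = LENGTH.unpack_from(self._data, offset)
        start = offset + LENGTH.size
        return bytes(self._data[start:start + length]), start + length
```

Because the consumer supplies the offset, a read is a direct seek with no index lookup, and the broker keeps no per-consumer delivery state beyond what the consumer itself remembers.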
![Page 29: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/29.jpg)
Kafka - conclusion
• Good for event-sourcing architectures (especially when they add HA support for brokers).
• Good to decouple incoming stream and processing to withstand request spikes.
• Very good for log aggregation and monitoring data collection.
• Bad for transactional messaging with rich delivery semantics (exactly-once, etc.).
![Page 30: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/30.jpg)
Twitter Finagle
“Finagle is a protocol-agnostic, asynchronous RPC system for the JVM that makes it easy to build robust clients and servers in Java, Scala, or any JVM-hosted language. Finagle supports a wide variety of request/response-oriented RPC protocols and many classes of streaming protocols.”
@ Twitter Engineering Blog
![Page 31: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/31.jpg)
Finagle - features
• Topology – all, very flexible.
• Retries – yes.
• Service discovery – yes (ZooKeeper).
• Guaranteed delivery – no.
• Acknowledge – no.
• Disconnect detection – yes.
• Transactions support (can participate in distributed transactions) – no.
• Persistence – no.
• Portability (one or many languages and platforms) – JVM only.
• Distributed – yes.
• Highly available – yes.
• Type (p2p or broker-based) – p2p.
• Load balancing strategy for consumers – yes (least connections etc).
• Client backoff strategy for producers – yes (limited exponential).
• Tracing support – yes (Zipkin).
• Library or standalone software – Scala library.
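"Limited exponential" backoff means the delay doubles after each failed attempt up to a fixed cap. The sketch below shows the idea in Python rather than through Finagle's own API; `backoff_delays` and `call_with_retries` are hypothetical helpers, and the sleep function is injectable so the schedule can be inspected.

```python
import time


def backoff_delays(base_seconds, cap_seconds, attempts):
    """Delays double on each attempt but never exceed the cap."""
    return [min(cap_seconds, base_seconds * 2 ** n) for n in range(attempts)]


def call_with_retries(func, base_seconds=0.1, cap_seconds=2.0,
                      attempts=5, sleep=time.sleep):
    """Call func until it succeeds; back off between failures, then re-raise."""
    delays = backoff_delays(base_seconds, cap_seconds, attempts)
    for attempt, delay in enumerate(delays):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last failure
            sleep(delay)
```

The cap keeps a long outage from pushing retry intervals to impractical lengths, while the doubling keeps a crowd of clients from hammering a server that is just coming back.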
![Page 32: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/32.jpg)
Finagle – from authors
Finagle provides a robust implementation of:
• connection pools, with throttling to avoid TCP connection churn;
• failure detectors, to identify slow or crashed hosts;
• failover strategies, to direct traffic away from unhealthy hosts;
• load-balancers, including “least-connections” and other strategies;
• back-pressure techniques, to defend servers against abusive clients and dogpiling.
Additionally, Finagle makes it easier to build and deploy a service that
• publishes standard statistics, logs, and exception reports;
• supports distributed tracing (a la Dapper) across protocols;
• optionally uses ZooKeeper for cluster management; and
• supports common sharding strategies.
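The “least-connections” strategy mentioned above picks the host that currently has the fewest in-flight requests. This is a language-neutral sketch of the idea, not Finagle's implementation; `LeastConnectionsBalancer` is a hypothetical class, and ties are broken by the order in which hosts were listed.

```python
class LeastConnectionsBalancer:
    """Routes each request to the host with the fewest active requests."""

    def __init__(self, hosts):
        self._active = {host: 0 for host in hosts}

    def acquire(self):
        """Pick the least-loaded host and count one more active request on it."""
        host = min(self._active, key=self._active.get)
        self._active[host] += 1
        return host

    def release(self, host):
        """Mark one request on the host as finished."""
        self._active[host] -= 1
```

A slow host accumulates in-flight requests and therefore receives less new traffic, which gives a simple form of failover without an explicit health check.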
![Page 33: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/33.jpg)
Finagle – Layered architecture
![Page 34: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/34.jpg)
Finagle - Filters
![Page 35: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/35.jpg)
Finagle – Event loop
![Page 36: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/36.jpg)
Finagle – Future pools
![Page 37: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/37.jpg)
Finagle - conclusion
• Good for complex JVM-based RPC architectures.
• Very good for Scala, worse experience with Java (but yes, they have some utility classes).
• Works well with Thrift and HTTP (plus trivial protocols), but lacks support for Protobuf and other popular stuff.
• Active developer community (Google group), but project infrastructure (maven repo, versioning, etc) is still being improved.
![Page 38: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/38.jpg)
Resources
• Moscow Big Systems / Big Data group http://www.meetup.com/bigmoscow/
• http://www.zeromq.org
• http://zerorpc.dotcloud.com
• http://kafka.apache.org
• http://twitter.github.io/finagle/
![Page 39: Modern Distributed Messaging and RPC](https://reader037.fdocuments.in/reader037/viewer/2022103111/54c6dda64a79597d0e8b45de/html5/thumbnails/39.jpg)
QUESTIONS?
AND CONTACTS
HTTP://MAKSIMALEKSEEV.MOIKRUG.RU/
HTTP://RU.LINKEDIN.COM/PUB/MAX-ALEXEJEV/51/820/AB9
HTTP://WWW.SLIDESHARE.NET/MAXALEXEJEV
[email protected]
SKYPE: MALEXEJEV