Webhooks, Asynchronous Web Applications and Push Notifications

I am the Application Architect at Montage Talent. My responsibilities at Montage primarily include designing the next generation of the Montage video interviewing solution. When I began with Montage, the service was a simple web application with a single user interface. Since joining Montage I have led the effort to create a service oriented application to allow clients to integrate our services using a web API and to allow Montage to create additional web-based user interfaces including interfaces intended to be used from PCs and mobile devices.

I have 10 years of professional development experience, 6 of which were in consulting. Much of my experience has been in enterprise environments, and that experience is what I have taken with me to Montage as I continue to help us build out our service oriented architecture.

Lastly, I ultimately love my job because I get to help people solve problems, and solving a specific problem is the basis for this presentation.

2

Montage Talent is The Leader in Video Interviewing Technology. The service that Montage provides is the Montage Network, a video interviewing solution sold as multiple applications: Montage Interview and Montage View.

Montage Interview is the application that we are using to live stream and record this presentation. Montage Interview is a live interviewing platform that allows for up to 16 live cameras to be connected in one live virtual interview room. Montage View is our recorded interview application. If you are a candidate, a recruiter may respond to your job inquiry by asking you to complete a Montage View. During this process you will be prompted to answer a series of questions using your webcam, and your answers are then available for the hiring company to review.

Montage does more than just provide the technology to power video interviews. Our solution allows our clients to elevate their brands by demonstrating to their candidates that they understand the needs of today’s candidate and utilize the latest technology to make the recruiting process as quick and seamless for the candidate as possible.

Our solution also enhances the candidate experience by allowing our clients to reach out to candidates, easily allowing candidates to apply for a job in any location around the globe. Our solution is also highly beneficial to those candidates who can’t express all of their qualifications in a traditional resume. By using video, candidates are able to demonstrate soft skills that may not translate well to a written document.

3

Here is an overview of what tonight’s presentation will entail.

First I want to present what my intent is for speaking with you tonight and why I thought this would be a useful topic to share. Then I will present the material in a problem solution format discussing what caused me to stumble upon this solution and discuss how I implemented the solution. Lastly, there should be time at the end for any questions anyone might have or discussion around how the demonstrated patterns and protocols might be useful in your environment.

4

I am going to work through this presentation by first introducing the problem space that we faced in our business model that caused us to implement the technical solution that my presentation is based on.

The intent of my presentation is not to tell you “this is how you should solve this problem,” but rather to use the problem space and my solution as a vehicle for a larger discussion around “the real-time web”, push notifications/events and webhooks. I have found the problem space to be very common, but I had difficulty finding a common pattern to solve it. By the end of my presentation I hope that you are able to identify this problem in your environment and use the concepts presented and discussed during this session to solve it.

5

While I was designing the new API services to be implemented at Montage, I quickly realized that Montage was unlike most other services I had worked with before. The workflow that we needed to support through our services is one that may take days to complete. Unlike the services I was used to working with, we are unable to give our API consumers immediate results in some cases. The ultimate result of a request may come back to the consumer days later, and during that time we can keep our consumers up-to-date on the status of the request, but it’s a far cry from immediate results.

Thus, we needed to develop a way to send status updates and the result to our consumers over this extended period of time and herein lies the problem.

Our services are exclusively RESTful, so implementing any sort of proprietary protocol or messaging was out of the question, especially since we don’t control our consumers. If this were just an internal service-to-service problem, any off-the-shelf messaging solution could have worked. But our messaging needed to support HTTP/S at a bare minimum. As a nice-to-have, if the messaging solution could also support other delivery protocols like SMTP, SMS, etc., that would be a plus.

Lastly, we needed to be sure that the solution we developed was reliable; otherwise our consumers would never know the status of their requests. This meant we had to either a) guarantee delivery or b) provide a way to query for status updates.

6

I had in my mind that the solution I was looking for was essentially a post office. I wanted to be able to utilize some service in which I could drop off a message indicating the delivery instructions, the payload and the receiver. This service would then do its best to deliver the message and, if the receiver could not be reached, would fail and notify me of the failure for reconciliation.

Some of the solutions I looked into were various ESBs and message queues.

• BizTalk

• RabbitMQ

However, at the time I wasn’t able to find a solution that was going to be able to deliver messages reliably over HTTP/S, so I started poking around looking for anything I could find to point me to a common solution to this problem.

I even asked a question about this on StackOverflow thinking that the thousands of developers out there would be able to help point me toward a solution I was looking for, but all I got were crickets. As you can see from the screenshot, I ultimately answered my own question.

7

What I eventually found were WebHooks. Unfortunately, the term WebHook is only barely common enough to be discoverable when you don’t already know what you are looking for. The term WebHook was coined by Jeff Lindsay, and he has a number of presentations available discussing webhooks.

Essentially, a WebHook is an HTTP POST used to notify a subscriber of an event that has occurred. It represents one way to begin enabling the real-time web. If a service does not offer WebHooks or some other type of notification service, consumers are forced to constantly poll the service looking for updates, which is sorely inefficient for both the publisher and subscriber. Not only is it inefficient, it also isn’t real-time. It’s only as timely as you are allowed to poll. Since WebHooks represent data that is pushed from a publisher to a subscriber, the subscriber will receive real-time data efficiently.
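To make the idea concrete, here is a minimal sketch of a publisher building the HTTP POST for one event. The callback URL, the event fields and the JSON body format are assumptions made for illustration; a WebHook only requires that the event arrive as an HTTP POST.

```python
import json
import urllib.request


def build_webhook_request(callback_url: str, event: dict) -> urllib.request.Request:
    """Build the HTTP POST that carries one event to one subscriber."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        callback_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical subscriber URL and event, for illustration only.
request = build_webhook_request(
    "https://subscriber.example.com/hooks/interviews",
    {"type": "interview.completed", "interview_id": 42},
)
# A publisher would then send it with urllib.request.urlopen(request).
```

The subscriber does nothing until this request arrives, which is what removes the need for polling.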

But what about RSS/ATOM, some might say? With a feed I can subscribe to and consume data from a publisher, and those specifications are widely implemented. The problem with feeds is that the publisher doesn’t send the feed data to a subscriber. The publisher simply publishes the data to a well-known location and anyone interested in consuming it has to come fetch it. Thus, feed readers have to implement polling to fetch the data. Again, this means that the reader is sorely inefficient and you never get your data immediately after it has been made available.

8

Facebook’s realtime API uses webhooks and is loosely based on PubSubHubbub, the solution protocol that I will dive into in a bit.

Stripe has just released a newly updated events API. Since their API is strictly JSON, they have a different implementation than I am going to present on today, but the concept is the same.

Twilio uses WebHooks to notify applications of incoming SMS messages and calls.

GitHub uses WebHooks to notify external services when a git push occurs.

9

Just recently the term “WebHook” seems to be starting to catch on. While the concept isn’t anything really new, the usage of the term is. Also, there is no standard implementation of WebHooks. There are quite a few implementations out there but there is no standard pattern.

There have been a few attempts to create a standard. The webhooks.org wiki makes an effort at creating a specification for a RESTful implementation, however their attempt simply shows examples of how this can be done, and is hardly a specification.

One specification comes from the XMPP Standards Foundation. XMPP maintains standards for use in IM/real-time messaging scenarios. One of these specs is XMPP PubSub. XMPP maintains “pure” standards. Implementing XMPP PubSub yourself wouldn’t be a good use of most developers’ time since you would spend a lot of time just implementing the protocol. Yes, finding a library to implement it would ease the pain, but then you have the other side of the fence, your consumers. Not only would you have to implement the protocol, but so would your consumers.

That’s where PubSubHubbub comes into play. PuSH is designed to be a “pragmatic” protocol, one that is much lighter weight and easier to implement than XMPP, but as a result, isn’t as well defined on all fronts, which we will see shortly.

10

PuSH was developed by 2 Google engineers, Brett Slatkin and Brad Fitzpatrick. Brad is best known for creating both LiveJournal and memcached.

PuSH is a simple to use, simple to implement, topic-based publish/subscribe protocol based on ATOM/RSS. The goal of PuSH is to convert ATOM/RSS feeds into real-time data by eliminating the traditional polling that occurs to consume most feeds. While it’s not a RESTful protocol itself in terms of the self-subscribe/unsubscribe specification, publishing is HTTP based.

PuSH is simple. It consists of 3 participants:

• Publisher

• Subscriber

• Hub

Together, these 3 participants can be combined to create a real-time messaging system that communicates over nothing but HTTP/S.

11

The link in this slide to the Subscription Flow points to the PubSubHubbub project site, which has a slide deck describing how the subscription process works.

The XML snippet shows an example of what the <link /> node would look like in an ATOM feed that supports PuSH.
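As a rough sketch of hub discovery, a subscriber can read the feed and pick out the link whose rel is "hub". The sample feed and its URLs below are made up for illustration and are not the snippet from the slide.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed; a PuSH-enabled feed advertises its hub with rel="hub".
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example notifications</title>
  <link rel="hub" href="https://hub.example.com/"/>
  <link rel="self" href="https://publisher.example.com/feed"/>
</feed>"""


def discover_links(feed_xml: str) -> dict:
    """Map each top-level <link> rel value to its href."""
    root = ET.fromstring(feed_xml)
    links = {}
    for link in root.findall(ATOM_NS + "link"):
        links[link.get("rel")] = link.get("href")
    return links


links = discover_links(SAMPLE_FEED)
# links["hub"] is where to subscribe; links["self"] is the topic URL.
```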

To subscribe, a subscriber makes an HTTP POST request to the link provided in the hub node. The hub will then verify the subscriber and return an HTTP status code that indicates whether the subscription was successful. Hubs can verify subscriptions either synchronously or asynchronously. If the hub is using synchronous verification it will return a 204 “No Content.” If asynchronous, it will return a 202 “Accepted.” In this case, the hub will do what it needs to verify the subscription and will send an HTTP GET request to the subscriber indicating that the subscription has been accepted. If a subscription request is denied, an appropriate code in the 4xx-5xx range will be returned.
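The subscription request can be sketched as follows. The form field names (hub.mode, hub.topic, hub.callback, hub.verify) come from the PuSH 0.3 spec; the function names and the labels returned by interpret_hub_status are my own, chosen for illustration.

```python
import urllib.parse
import urllib.request


def build_subscribe_request(hub_url, topic_url, callback_url, verify="sync"):
    """Build the form-encoded POST a subscriber sends to the hub."""
    form = urllib.parse.urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic_url,
        "hub.callback": callback_url,
        "hub.verify": verify,  # "sync" or "async"
    }).encode("ascii")
    return urllib.request.Request(
        hub_url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def interpret_hub_status(status: int) -> str:
    """Translate the hub's response code into a subscription state."""
    if status == 204:
        return "verified"   # synchronous verification succeeded
    if status == 202:
        return "pending"    # hub will verify asynchronously
    if 400 <= status < 600:
        return "rejected"
    return "unexpected"
```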

Callback authorization uses a challenge key to authorize the subscription.

Subscribers can renew their subscription at any time by re-subscribing.
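On the subscriber side, the challenge step can be sketched like this: the hub sends a GET to the callback with a hub.challenge value, and the subscriber confirms by echoing that value back in the response body. The parameter names follow PuSH 0.3; the function name and the choice of 404 for a refusal are assumptions of this sketch.

```python
from urllib.parse import parse_qs, urlparse


def handle_verification(request_url: str, expected_topics: set) -> tuple:
    """Return (status, body) for the hub's verification GET request."""
    params = parse_qs(urlparse(request_url).query)
    mode = params.get("hub.mode", [""])[0]
    topic = params.get("hub.topic", [""])[0]
    challenge = params.get("hub.challenge", [""])[0]
    known_mode = mode in ("subscribe", "unsubscribe")
    if known_mode and topic in expected_topics and challenge:
        # Echoing the challenge confirms that we asked for this change.
        return 200, challenge
    # Refuse anything we did not request.
    return 404, ""
```

Because a renewal is just another subscribe request, it goes through the same verification step.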

12

PuSH is designed so that all complexity exists in the hub. All a publisher has to provide is a link in their feed to their hub and then they need to negotiate with their hub on how they will notify the hub that they have updated content.

All the rest of the work is put on the hub. The hub is responsible for:

• accepting and verifying subscriptions

• managing the subscriptions

• handling update pings from the publishers

• extracting new and updated data from the publisher’s feed

• sending subscribers their new content

• DoS protections

Publishers may be their own hub, or they can use a commercially available hub. The protocol allows for publishers and hubs to negotiate how they are going to communicate. This means they get to choose the protocols and data formats they will use internally. As a matter of fact, Montage has taken some liberties with this flexibility in the protocol, which we will get into in a bit.

Thrift is a cross-language services development framework open sourced by Facebook.

13

Superfeedr is the largest public PuSH hub. Since PuSH is a pragmatic protocol it doesn’t define the implementation to a T. This allows hubs to take some liberties with their implementations, and Superfeedr has taken advantage of this.

• They have added digest notifications which will give subscribers a digest of their subscription, and they suggest it can be used as a heartbeat to ensure your subscription is still active.

• Feed status provides information on a feed, such as how much data was fetched, how many new entries there were and when the next fetch will occur.

• Virtual feeds allow subscribers to filter their feed at the hub.

http://boxcar.io/ - Instant Twitter, Facebook & Feed notifications to mobile devices

14

This diagram demonstrates the messaging infrastructure within the Montage Network.

At Montage we are currently maintaining two APIs. One is our Core API and the other is our Notifications API. The Notification API is nothing more than endpoints that respond to GET requests with an ATOM feed.

Since the PuSH spec leaves the communication between the publisher and the hub open for negotiation, we have taken some liberties that have allowed us to use push notifications, even though our entire infrastructure has not yet been converted to use real-time events. At this time, Montage’s main user interface has not yet been converted to use the API, so any interaction with the system through that interface is not set up to raise real-time notifications to the hub.

To compensate, we have set up a polling service to pull data from the notification feeds. This polling actually addresses two needs. One is the aforementioned issue of supporting a legacy application. The second is that we use the polling service as a catch-all for any missed notifications. We have made a decision to centralize all complexity in the hub, which means the update pings from the Montage API are not reliable. If one fails, we don’t care and we won’t retry. The polling service will eventually pick up the missed event.
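One way to picture the catch-all is a hub that remembers which entry IDs it has already seen, so that entries arriving through a ping-triggered fetch and through the periodic poll are only forwarded once. This is a simplified in-memory sketch, not a description of the Montage implementation; the class name and entry shape are hypothetical.

```python
class EntryDeduplicator:
    """Track entry IDs so the ping path and the polling path can overlap safely."""

    def __init__(self):
        self._seen_ids = set()

    def new_entries(self, entries):
        """Return only the entries that have not been seen before."""
        fresh = []
        for entry in entries:
            if entry["id"] not in self._seen_ids:
                self._seen_ids.add(entry["id"])
                fresh.append(entry)
        return fresh


dedup = EntryDeduplicator()
# A ping for e1 arrives and its fetch returns e1.
from_ping = dedup.new_entries([{"id": "e1"}])
# The ping for e2 was lost; the next poll returns the whole feed.
from_poll = dedup.new_entries([{"id": "e1"}, {"id": "e2"}])
```

Here the poll picks up only e2, the event whose ping was lost.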

15

You may be wondering what this fetch update is for. If we have polling in place that will pull event data from the Montage API, and we also support an update ping from the Montage API, what is this doing there?

There are two ways we could have implemented our update pings. The way we are doing it is to use a ping that just tells the hub that new or updated data for a given client is available on a feed. The hub then takes this information and fetches the updated data. This allows the hub to control how much data it fetches and how and when it fetches it: it could fetch the data immediately or wait, but it has that flexibility.

The other way would be a fat ping. The Montage API could be configured to send a ping that includes the updated data to the hub. This eliminates the callback step and could help prevent an accidental DoS on the Notification API if the hub were overloaded with pings that all required callbacks to the same feed endpoint. These are real benefits that we weighed and will continue to look at. However, for now our pattern is to yield full control to the hub, which is one reason why we have chosen the former type of ping.
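The difference between the two kinds of ping can be sketched as follows. The JSON field names (client, feed, entries) are invented for illustration; the PuSH spec leaves the publisher-to-hub format open, and this is not the format Montage uses.

```python
import json


def light_ping(client_id, feed_url):
    """A ping that only says 'something changed on this feed'."""
    return json.dumps({"client": client_id, "feed": feed_url})


def fat_ping(client_id, feed_url, entries):
    """A ping that carries the changed entries along with it."""
    return json.dumps({"client": client_id, "feed": feed_url, "entries": entries})


def entries_for_ping(ping_body, fetch_feed):
    """Hub side: use the included entries, or call back to the feed."""
    ping = json.loads(ping_body)
    if "entries" in ping:
        return ping["entries"]          # fat ping: no callback needed
    return fetch_feed(ping["feed"])     # light ping: hub decides how to fetch
```

With a light ping the hub is free to delay or batch the call to fetch_feed; with a fat ping that call never happens.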

16

As I mentioned at the beginning, I was initially looking for a solution that would allow us to send notifications over HTTP/S, SMTP, SMS, etc. After failing to find a tool or service that did this, I realized that PuSH could actually enable us to do it quite easily.

Our solution is to create our own subscribers that translate the notification POSTs into whatever protocol we ultimately want the notification delivered over. For example, when a recruiter sets up a live interview, emails need to be sent to the participants with the meeting link. What we do is set up a subscriber for that client with a URL to our Email Service. The Email Service accepts a notification that a new interview has just been created, merges the participants with an email template and sends the email to each recipient.
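A subscriber that translates a notification into email might look roughly like this. The notification fields (participants, meeting_link), the template text and the sender address are all hypothetical; only the overall flow of merging participants with a template comes from the description above.

```python
from email.message import EmailMessage
from string import Template

# Hypothetical template; a real service would load this per client.
INVITE_TEMPLATE = Template("Hello $name,\n\nYour interview link is $link\n")


def emails_for_notification(notification, sender="interviews@example.com"):
    """Turn one 'interview created' notification into one email per participant."""
    messages = []
    for participant in notification["participants"]:
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = participant["email"]
        msg["Subject"] = "Interview scheduled"
        msg.set_content(INVITE_TEMPLATE.substitute(
            name=participant["name"],
            link=notification["meeting_link"],
        ))
        messages.append(msg)
    return messages
```

Each returned message could then be handed to smtplib.SMTP.send_message. An SMS subscriber would follow the same shape with a different final step.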

With this pattern, we can support any outgoing protocol. In addition, we also gain the benefit of allowing flexibility for our consumers. Some of our consumers would rather send out communications themselves. They have the flexibility to set up their own subscribers to do the translation and delivery of these messages if they desire.

17

As for what our endpoints look like, our pattern is to create an endpoint per notification type in the Notification API. Data is segregated by client, can be limited by the date/time of the notifications, and can have count limits set by the take and skip parameters. By using a feed that supports subscriptions, we give our consumers flexibility in how they wish to consume notifications.
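The effect of the date limit and the take and skip parameters can be sketched with a small filter over an in-memory list. The since parameter name, the default page size and the entry shape are assumptions of this sketch; only take and skip are named in the description above.

```python
from datetime import datetime
from typing import Optional


def page_notifications(notifications, since: Optional[datetime] = None,
                       skip: int = 0, take: int = 50):
    """Filter by date, order oldest first, then apply skip and take."""
    matching = [n for n in notifications
                if since is None or n["created"] >= since]
    matching.sort(key=lambda n: n["created"])
    return matching[skip:skip + take]
```

A consumer that polls can walk the feed by increasing skip in steps of take until a page comes back empty.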

If they choose not to set up an endpoint that we can POST to for subscriptions, the consumer is able to poll their feed for updates (as long as they don’t DoS us). However, as we would prefer, they may set up a subscription to any of the notification endpoints and we will notify them of new events.

18

This diagram represents the internals of the Montage Notification Hub.

The Montage Notification Hub is a service that accepts update notifications from the Montage Core API when events occur, fetches data that applies to these events and forwards the data to anyone who has registered as a subscriber for that data. Our hub maintains its own database of feeds and subscribers as well as any subscription verification data it may need to accept new subscribers.
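Stripped to its core, the accept, fetch and forward cycle looks something like the sketch below. It is an in-memory illustration with hypothetical names, not the Montage hub: fetch_feed and deliver are injected stand-ins for the HTTP calls, and a real hub would persist its subscribers and track which entries it has already delivered.

```python
from collections import defaultdict


class NotificationHub:
    """Minimal hub: remember subscribers per feed and fan out fetched entries."""

    def __init__(self, fetch_feed, deliver):
        self._fetch_feed = fetch_feed    # callable(feed_url) -> list of entries
        self._deliver = deliver          # callable(callback_url, entry)
        self._subscribers = defaultdict(list)

    def subscribe(self, feed_url, callback_url):
        """Register a callback once per feed."""
        if callback_url not in self._subscribers[feed_url]:
            self._subscribers[feed_url].append(callback_url)

    def handle_update(self, feed_url):
        """Called on an update ping: fetch the feed and forward every entry."""
        deliveries = 0
        for entry in self._fetch_feed(feed_url):
            for callback_url in self._subscribers[feed_url]:
                self._deliver(callback_url, entry)
                deliveries += 1
        return deliveries
```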

One part of this diagram refers to something that has not yet been addressed: message reliability. Earlier I mentioned that we have pushed all reliability to the hub, and this is where it comes into play. We will be implementing a queuing system to give us message durability and reliability. The addition of this queue will allow our subscribers to configure what to do in the case of a failure. Will we try to resend? How many times will we try to resend, and how often? What happens when all that fails? The PuSH spec addresses reliability, but very loosely. “Hubs SHOULD retry notifications repeatedly until successful (up to some reasonable maximum over a reasonable time period).” Obviously the spec isn’t too helpful in this regard, so in our case, our intent is to leave the reliability configuration up to the subscriber.
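The questions above (whether to resend, how many times and how often) map naturally onto a small per-subscriber policy. This is only a sketch of the idea with hypothetical names and defaults; it retries in-process, whereas the planned design would sit on top of a durable queue.

```python
import time
from dataclasses import dataclass


@dataclass
class RetryPolicy:
    """Hypothetical per-subscriber settings."""
    max_attempts: int = 3
    delay_seconds: float = 60.0


def deliver_with_retry(send, policy, sleep=time.sleep):
    """Call send() until it returns True or the attempts run out.

    Returns the attempt number that succeeded, or None if every attempt
    failed and the message needs to be handled some other way.
    """
    for attempt in range(1, policy.max_attempts + 1):
        if send():
            return attempt
        if attempt < policy.max_attempts:
            sleep(policy.delay_seconds)
    return None
```

A None result is the "what happens when all that fails" case, for example moving the message to a dead-letter store or alerting the subscriber.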

19

As mentioned, the PuSH spec is pragmatic, not pure, and definitely doesn’t define every nuance of an implementation. One of those nuances is authentication.

The only thing close to authentication that PuSH addresses is transport layer security and message security. Transport security is easy and obvious: HTTPS, where possible. The spec does describe how to distribute content securely over an open connection, which involves signing the payload of the message with an HMAC so the subscriber can verify that it came from the hub.

What the spec doesn’t address is what happens if the subscriber requires the hub to authenticate with it before it will accept notifications. This may not seem like a big deal when you are using a hub to retrieve blog postings, but when you are transmitting sensitive data, authenticating with the subscriber isn’t a terrible thought. As a result of our clients’ needs, we have taken our own liberties with the spec and have implemented the ability to authenticate with our subscribers. Our architecture should allow us to authenticate with any subscriber in the manner in which they would like us to authenticate.

HMAC Hashing
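In PuSH 0.3, the subscriber supplies a shared secret when subscribing, and the hub then sends an X-Hub-Signature header of the form sha1=<hex digest>, computed as an HMAC-SHA1 over the request body. A minimal sketch of both sides, with hypothetical function names:

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Hub side: compute the X-Hub-Signature header value for a body."""
    digest = hmac.new(secret, body, hashlib.sha1).hexdigest()
    return "sha1=" + digest


def verify_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    """Subscriber side: recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign_payload(secret, body), header_value)
```

The signature proves the body was produced by someone holding the secret and was not altered in transit. It does not hide the content, so HTTPS is still needed for confidentiality.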

20

Ideally we would like to utilize a public hub or some other service to do this. While we think that the solution to our problem has been well designed, maintaining a hub and sending notifications isn’t our core business. We need it to support what we do, but actually making it happen is a distraction from our core functionality. For us to go to a service, the service is going to have to be flexible enough for us to accommodate the varying needs of our consumers, or we will have to work with our consumers to conform to the way that we send notifications.

While it’s ideal that we work with a service provider for our notifications, that doesn’t seem like it’s in the cards anytime soon. As a result, we are planning to implement queuing and are looking for both queuing tools and tools to manage reading items off the queue. Managing reading the items off the queue, again, isn’t core to what we do, so any package we can utilize to manage that will be in our best interest.

Self subscribe/unsubscribe will be added in the future allowing our clients to maintain what notifications they would like to receive.

Lastly, for simplicity all of our messaging within the Montage Network is HTTP/S, as are the notifications that are sent to our consumers. A change we would like to make is to move to a lighter, faster protocol internally, especially between the publisher and the hub.

21
