The Wix Microservice Stack

The Wix Microservice Stack Tomer Gabel, Java.IL Meetup February 2016

Microservices are a-buzzin’

Focal points:

1. Topology

2. Networking

3. Structure

4. Operations

Our conceptual system

Store Service

Checkout Service

Cart Service

1. TOPOLOGY

Our conceptual system

Store Service

Checkout Service

Cart Service

Host A

Host B

Host C

Service Scheduling

Topology

Service→host mapping

Server inventory

Service catalogue

Formally, “scheduling”

Service Scheduling

• A hard problem!

• Multiple dimensions:

– Resource utilization (disk space, I/O, RAM, network, power…)

– Resource availability

– Failover (physical server, rack, row…)

– Custom constraints (zoning, e.g. PCI compliance)

Service Scheduling

• The middle ground:

– Naïve automatic scheduler

– Human-configured overrides for zoning, optimization

• Easy but limited scale

– A few hundred servers
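
The "naïve scheduler plus human overrides" idea can be sketched in a few lines. This is an illustration only, not the actual Wix scheduler: the class name `NaiveScheduler`, the host names, and the least-loaded placement rule are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical naive scheduler: least-loaded placement plus human-configured pins. */
final class NaiveScheduler {
    private final List<String> hosts;             // server inventory
    private final Map<String, String> overrides;  // service -> pinned host (e.g. a PCI zone)

    NaiveScheduler(List<String> hosts, Map<String, String> overrides) {
        this.hosts = new ArrayList<>(hosts);
        this.overrides = new LinkedHashMap<>(overrides);
    }

    /** Returns a service -> host mapping; services are placed in the order given. */
    Map<String, String> schedule(List<String> services) {
        Map<String, Integer> load = new LinkedHashMap<>();
        for (String host : hosts) load.put(host, 0);

        Map<String, String> placement = new LinkedHashMap<>();
        for (String service : services) {
            String host = overrides.get(service);
            if (host == null) {
                host = leastLoaded(load);          // automatic placement
            } else if (!load.containsKey(host)) {
                throw new IllegalArgumentException("Override names unknown host: " + host);
            }
            placement.put(service, host);
            load.merge(host, 1, Integer::sum);
        }
        return placement;
    }

    /** Picks the host with the fewest services; ties go to the earliest host in the inventory. */
    private static String leastLoaded(Map<String, Integer> load) {
        String best = null;
        for (Map.Entry<String, Integer> e : load.entrySet()) {
            if (best == null || e.getValue() < load.get(best)) best = e.getKey();
        }
        if (best == null) throw new IllegalStateException("No hosts in inventory");
        return best;
    }
}
```

A real scheduler would weigh the dimensions listed earlier (resource utilization, failover domains and so on); here "load" is just the number of services already placed on a host.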

Our conceptual system

Store Service

Checkout Service

Cart Service

http://err:42/uh

… derp?

Service Discovery

Static Dynamic

Logical

Physical

That way madness lies

In practice

• Static topology

– Managed with Frying Pan

– Exported to Chef

– Deployed via configuration files

• Live registry in Zookeeper

– Deployment only

– … for now
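
As a rough illustration of static, logical discovery through configuration files, the sketch below resolves a service name against a properties-style file. The `service.<name>=host:port` key format, the class name `StaticTopology` and the host names are assumptions for the example; the talk does not show the real file format.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.Properties;

/** Hypothetical lookup of logical service names in a statically deployed config file. */
final class StaticTopology {
    private final Properties entries = new Properties();

    /** Parses config text with lines such as "service.cart=host-c.example:8081". */
    StaticTopology(String config) {
        try {
            entries.load(new StringReader(config));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Maps a logical service name to the base URL of its physical endpoint. */
    URI resolve(String serviceName) {
        String address = entries.getProperty("service." + serviceName);
        if (address == null) {
            throw new IllegalArgumentException("Unknown service: " + serviceName);
        }
        return URI.create("http://" + address + "/");
    }
}
```

Because the mapping only changes when the file is redeployed, callers can resolve once at startup; a dynamic registry would require watching for changes instead.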

2. NETWORKING

Back to diagrams

Store Service

Checkout Service

Cart Service

Protocol

• Style

– RPC

– Message passing

• Transport

– TCP

– HTTP

• Serialization

– JSON

– Protocol Buffers

– Thrift

– Avro

Protocol

• Style

– RPC

• Transport

– HTTP

• Serialization

– JSON

Load balancing

• Centralized

– Simple

– Limited flexibility

– Limited scale

– Thin implementation, highly portable

– Suitable for static topologies

• Distributed

– Highly scalable

– Flexible

– Fully dynamic

– Fat implementation, difficult to port

• Quasi-distributed

– e.g. Synapse

– Best of both worlds?
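
To make the "distributed" option concrete, here is a minimal sketch of client-side balancing, where each caller picks an endpoint itself instead of going through a central proxy. The class name `RoundRobinBalancer` and the round-robin policy are assumptions for illustration; a real fat client would also track endpoint health and topology changes.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical client-side balancer that cycles through a fixed endpoint list. */
final class RoundRobinBalancer {
    private final List<String> endpoints;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(List<String> endpoints) {
        if (endpoints.isEmpty()) throw new IllegalArgumentException("No endpoints");
        this.endpoints = List.copyOf(endpoints);
    }

    /** Returns the next endpoint; safe to call from multiple threads. */
    String pick() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), endpoints.size());
        return endpoints.get(index);
    }
}
```

Even this trivial policy has to be reimplemented in every client language, which is the portability cost the slide attributes to fat implementations.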

Frying Pan

Chef

Nginx

To our shame

• There’s always IDL.

• Informal

– Usually ad-hoc documentation

• Formal

– Swagger, Apiary etc.

– ProtoBuf, Thrift, Avro

– WSDL, god forbid!

• … or

– Ad-hoc

public interface SiteMembersService {
    SiteMemberDto getMemberById(
        Guid<SiteMember> memberId,
        UserGuid userId);

    SiteMemberDto getMemberOrOwnerById(
        Guid<SiteMember> memberId,
        Guid<SMCollection> collectionId);

    SiteMemberDto getMemberDtoByEmailAndCollectionId(
        String email,
        Guid<SMCollection> collectionId);

    List<SiteMemberDto> listMembersByCollectionId(
        Guid<SMCollection> collectionId);
}

In Detail

• Java interfaces?

+ Ridiculously simple

+ Lend well to RPC

– Coupled to JVM

• JSON serialization

+ Jackson-based

+ Custom, extensible mapping

– Reflection-based

• Server stack (JVM)

– Jetty

– Spring + Spring MVC

– Custom handler

• Client stack (JVM)

– Spring

– Proxy classes generated at runtime

– AsyncHttpClient
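
The client-side idea (a Java interface as the contract, with a proxy generated at runtime that turns each call into an RPC request) can be sketched with the JDK's `java.lang.reflect.Proxy`. Everything below is a toy under stated assumptions: `GreetingService`, `RpcClientFactory`, the URL layout and the hand-rolled JSON encoding are invented for the example, and the transport is a plain function rather than Spring, Jackson or AsyncHttpClient.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.StringJoiner;
import java.util.function.BiFunction;

/** Hypothetical service contract, standing in for an interface like SiteMembersService. */
interface GreetingService {
    String greet(String name);
}

final class RpcClientFactory {
    private RpcClientFactory() {}

    /**
     * Creates a runtime proxy for the given interface. Each call becomes a
     * (URL, JSON body) pair handed to the transport, whose raw response is returned.
     * Assumption: every method returns String and takes simple arguments.
     */
    @SuppressWarnings("unchecked")
    static <T> T create(Class<T> api, String baseUrl,
                        BiFunction<String, String, String> transport) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getDeclaringClass() == Object.class) {
                // toString/equals/hashCode are not remote calls in this sketch.
                throw new UnsupportedOperationException(method.getName());
            }
            String url = baseUrl + "/" + api.getSimpleName() + "/" + method.getName();
            return transport.apply(url, toJsonArray(args));
        };
        return (T) Proxy.newProxyInstance(api.getClassLoader(), new Class<?>[] {api}, handler);
    }

    /** Encodes the arguments as a JSON array of strings (toy encoder, not Jackson). */
    private static String toJsonArray(Object[] args) {
        StringJoiner json = new StringJoiner(",", "[", "]");
        if (args != null) {
            for (Object arg : args) {
                String escaped = String.valueOf(arg).replace("\\", "\\\\").replace("\"", "\\\"");
                json.add("\"" + escaped + "\"");
            }
        }
        return json.toString();
    }
}
```

Calling `greet("world")` on a proxy created for `http://greeter.example:8080` hands the transport the URL `http://greeter.example:8080/GreetingService/greet` and the body `["world"]`. The reliance on JVM reflection is also why this approach is "coupled to JVM".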

Cascade Failures

• What is a cascade failure?

• Mitigations

– Bulkheading

– Circuit breakers

– Load shedding

• We don’t do any of that.
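
For readers unfamiliar with the mitigations named above, here is a minimal circuit breaker sketch. It is not part of the Wix stack (the slide says as much); the class name, the consecutive-failure threshold and the simplified half-open behaviour are assumptions chosen to keep the example short.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical circuit breaker: opens after N consecutive failures, retries after a cool-down. */
final class CircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock; // injected so the cool-down can be tested without sleeping
    private int consecutiveFailures = 0;
    private boolean open = false;
    private long openedAt = 0;

    CircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    /** Runs the remote call unless the breaker is open; falls back on failure or when open. */
    <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        if (!allowRequest()) return fallback.get(); // fail fast, sparing the struggling service
        try {
            T result = remoteCall.get();
            recordSuccess();
            return result;
        } catch (RuntimeException e) {
            recordFailure();
            return fallback.get();
        }
    }

    private synchronized boolean allowRequest() {
        if (!open) return true;
        // Simplified half-open state: once the cool-down has elapsed, let requests through again.
        return clock.getAsLong() - openedAt >= openMillis;
    }

    private synchronized void recordSuccess() {
        consecutiveFailures = 0;
        open = false;
    }

    private synchronized void recordFailure() {
        consecutiveFailures++;
        if (consecutiveFailures >= failureThreshold) {
            open = true;
            openedAt = clock.getAsLong();
        }
    }
}
```

While the breaker is open, callers get the fallback immediately instead of piling more requests onto a failing dependency, which is what stops a local failure from cascading.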

Does it go?

• Short answer: yes.

• Battle-tested

– Evolving since 2010.

– 200 services in production.

• Known quantity

– Easy to operate

– Performs well enough

– Most problems have easy workarounds

Not all is well, though

• Polyglot development

– Custom client stack

– Expensive to port!

• Implicit state

– Transparently handled by the framework

– Thread local storage

– Hard to go async!

Client Proxy, Service A, Service B

Implicit state passed along with each call: session info, transaction ID, A/B experiment
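
Why thread-local state makes async hard can be shown with a small sketch: a value stored in a `ThreadLocal` is invisible on any other thread, so work handed to an executor loses it unless it is captured and reinstated explicitly. `RequestContext` and its methods are hypothetical names for this example, not the framework's API.

```java
import java.util.function.Supplier;

/** Hypothetical holder for implicit per-request state, here just a transaction ID. */
final class RequestContext {
    private static final ThreadLocal<String> TRANSACTION_ID = new ThreadLocal<>();

    private RequestContext() {}

    static void set(String id) { TRANSACTION_ID.set(id); }
    static String get() { return TRANSACTION_ID.get(); }
    static void clear() { TRANSACTION_ID.remove(); }

    /**
     * Captures the calling thread's transaction ID and reinstates it around the task,
     * so the task still sees it when it runs on a different thread.
     */
    static <T> Supplier<T> propagating(Supplier<T> task) {
        String captured = get();
        return () -> {
            String previous = get();
            set(captured);
            try {
                return task.get();
            } finally {
                // Restore whatever the worker thread had before, to avoid leaking state.
                if (previous == null) clear(); else set(previous);
            }
        };
    }
}
```

If the request thread calls `RequestContext.set("tx-1")` and then submits `RequestContext::get` to an executor, the task sees `null`; wrapping it as `RequestContext.propagating(RequestContext::get)` makes it see `"tx-1"`. Every asynchronous hand-off needs this kind of wrapping, which is the cost of keeping state implicit.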

3. STRUCTURE

Codebase modeling

• A product comprises multiple services

• Services have dependencies

– Creating a DAG

– Tends to cluster around domains

• Org structure reflects the clustering (Conway)
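
Because the dependencies form a DAG, a dependencies-first ordering always exists and can be computed with a depth-first topological sort. The sketch below is a generic illustration; `BuildOrder` and the example service names are assumptions, and it is not the algorithm used by Wix's own tooling.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper that orders services so each one comes after its dependencies. */
final class BuildOrder {
    private BuildOrder() {}

    /** deps maps each service to the services it depends on. */
    static List<String> of(Map<String, List<String>> deps) {
        List<String> order = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        for (String service : deps.keySet()) {
            visit(service, deps, done, inProgress, order);
        }
        return order;
    }

    private static void visit(String service, Map<String, List<String>> deps,
                              Set<String> done, Set<String> inProgress, List<String> order) {
        if (done.contains(service)) return;
        if (!inProgress.add(service)) {
            // Seeing a service again while it is still being visited means the graph is not a DAG.
            throw new IllegalStateException("Dependency cycle at " + service);
        }
        for (String dependency : deps.getOrDefault(service, List.of())) {
            visit(dependency, deps, done, inProgress, order);
        }
        inProgress.remove(service);
        done.add(service);
        order.add(service); // appended only after all of its dependencies
    }
}
```

The same ordering answers practical questions such as which artifacts must be rebuilt, and in what sequence, after a change to a shared dependency.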

Codebase modeling

Repository-per-Service

• Small repositories

• Artifacts built independently

• Binary dependencies

• Requires specialized tools to manage:

– Versions

– Build dependencies

Mono-repo

• Repository contains everything

• Code is built atomically

• Source dependencies

• Requires a specialized build tool

At Wix

• One repo per domain

• Dependencies:

– Declared in POMs

– Version management via custom plugin

– Builds managed by custom tool*

• Custom dashboard, “Wix Lifecycle”

* Lifecycle – Dependency Management Algorithm

Version management

[INFO] QuickRelease/home/builduser/agent01/work/d9922a1c87aee4bb bf1bc8bcfb2eccebc4268651c5f19faa689be6e4
[08:10:55][INFO] Adding tag RC;.;1.20.0
[08:10:56][INFO] Tag RC;.;1.20.0 added successfully
[08:10:56][INFO] Working on onboarding-server-web
[08:10:56][INFO] onboarding-server-web-1.19.0-SNAPSHOT jar deployable copied
[08:10:56][INFO] onboarding-server-web-1.19.0-SNAPSHOT jar sources copied
[08:10:56][INFO] onboarding-server-web-1.19.0-SNAPSHOT jar copied
[08:10:56][INFO] onboarding-server-web-1.19.0-SNAPSHOT jar tests copied
[08:10:56][INFO] onboarding-server-web pom deployed
[08:10:57][INFO] Deploying artifacts to release artifacts repository
[08:10:57][INFO] Deploying onboarding-server-web to RELEASE
[08:10:57][INFO] pushing new pom
[08:10:59]2016-02-22 08:10:39 [INFO ] /usr/bin/git push --tag origin master exitValue = 0

• All artifacts share a common parent

– Master list of versions

• Manually-triggered release builds

– Custom release plugin

– Increments version

– Updates master

– Pushes changes to git
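
The "increments version" step might look roughly like the sketch below. The bump rule is an assumption inferred from the log above, where artifacts at 1.19.0-SNAPSHOT are tagged as 1.20.0; the actual behaviour of the custom release plugin is not described in the talk, and `Versions.nextMinor` is a hypothetical helper.

```java
/** Hypothetical version helper; the real release plugin's rules are not documented here. */
final class Versions {
    private Versions() {}

    /** Bumps the minor component and resets the patch: "1.19.0-SNAPSHOT" becomes "1.20.0". */
    static String nextMinor(String version) {
        String suffix = "-SNAPSHOT";
        String base = version.endsWith(suffix)
                ? version.substring(0, version.length() - suffix.length())
                : version;
        String[] parts = base.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected major.minor.patch: " + version);
        }
        int major = Integer.parseInt(parts[0]);
        int minor = Integer.parseInt(parts[1]);
        return major + "." + (minor + 1) + ".0";
    }
}
```

With a shared parent holding the master list of versions, the result of a bump like this only needs to be written in one place for dependants to pick it up.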

4. OPERATIONS

Back to diagrams

Store Service

Checkout Service

Cart Service

How ya doin’?

Health

• Host monitoring

– Nagios Sensu alerts

– Usual host metrics

– Health-check endpoint in framework

• End-to-end

– Pingdom

• Business

– Custom BI toolchain

Instrumentation

• Metrics

– Codahale Metrics

– Reporting to Graphite and Anodot

– Implicit collection (e.g. RPC)

– APIs for custom metrics

• Alerts via Anodot

• Custom New Relic error reporting
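
"Implicit collection" means the framework measures calls such as RPCs without the service author writing any metrics code. The sketch below shows the general shape using only the JDK; it is a stand-in for illustration and does not use or imitate the Codahale Metrics API. `CallMetrics` and the metric name are assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical JDK-only call recorder: counts calls and accumulates their duration. */
final class CallMetrics {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();

    /** Runs the call and records its count and duration under the given name, even if it throws. */
    <T> T timed(String name, Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            counts.computeIfAbsent(name, k -> new LongAdder()).increment();
            totalNanos.computeIfAbsent(name, k -> new LongAdder()).add(System.nanoTime() - start);
        }
    }

    long count(String name) {
        LongAdder adder = counts.get(name);
        return adder == null ? 0 : adder.sum();
    }

    long totalNanos(String name) {
        LongAdder adder = totalNanos.get(name);
        return adder == null ? 0 : adder.sum();
    }
}
```

A framework would wrap each outgoing RPC in something like `timed("rpc.getMemberById", ...)` inside the client proxy, so every service gets baseline metrics for free.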

Debugging

• Logs

– Good old Logback

– No centralized aggregation

– Not particularly useful

• Feature toggle overrides

• Distributed tracing

WE’RE DONE HERE!

… AND YES, WE’RE HIRING :-)

Thank you for listening

@tomerg

http://il.linkedin.com/in/tomergabel

Wix Engineering blog:

http://engineering.wix.com