Cloud Solution Day 2016: Microservices on Mesos & Netflix OSS
Uploaded by aws-vietnam-community
Microservices on Mesos & Netflix OSS
Before we start (or ground rules)
● Forgive my shaking voice
● Feel free to tell me I’m stupid
● Feel free to interrupt me with questions
Agenda
● About Hoiio Stack
● Phoenix Deployment
○ Mesos/Marathon
● Netflix OSS @ Hoiio
○ Service Discovery
■ Consul
■ Eureka
○ API Gateway
● (Monitoring)
About Hoiio
● VoIP/SIP
● SMS/Email
● Connected Apps
● HR suite
Services
● Distributed, multi-service microservices
● RabbitMQ as message broker; peer-to-peer calls via HTTP
● Kafka as event store
● Java/Python/... modules
● Entirely on AWS
Platform
[Diagram: zuul, auth, billing, and sip modules, connected over HTTP]
Phoenix Deployment: Mesos/Marathon
Mesos
● Abstracts a cluster of machines into a single “black-box” machine
● Master nodes and slave/agent nodes
● Tasks are submitted to the master
● The master schedules each task on one of the slaves
Diagram source: http://www.slideshare.net/spodila/prezo-tovmware-june2015
Marathon
● Framework running on top of Mesos
● Manages task configuration, number of instances, ...
● Health checks
● REST interface
● Analogy: Mesos is the OS, Marathon is the task manager
[Diagram labels: Marathon, Mesos Master, Mesos Slave, CPU/Memory, Kernel Scheduler, Task Manager]
Phoenix?
● Docker as the container runtime
○ Supported by Mesos
○ Images live in AWS ECR as a private repo, or in a private repo running on Marathon
● Marathon performs health checks and replaces unhealthy instances
● Replacement takes seconds!
{ "id": "ms-uat-xxx", "mem": 384, "cpus": 0.5, ... }
Service Discovery: Netflix Eureka
● Eureka server & client
● Server routes are replicated
● Each client holds a copy of the route table
● Route tables are updated in the background
https://github.com/Netflix/eureka/wiki/Eureka-at-a-glance
● Eureka
○ The Eureka server tracks which service is running where (which IP and port?)
○ All records are replicated to all Eureka clients
● Ribbon
○ Picks a server from the record replica on the local Eureka client
○ Makes the request to the picked server
○ Retries if configured
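The "pick a server from the local replica" step can be sketched in plain Java. This is a minimal illustration, not Ribbon's or Eureka's actual API: `LocalRegistry` and `RoundRobinPicker` are hypothetical names, round-robin is just one possible picking rule, and the background refresh that fills the registry is assumed to exist elsewhere.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the record replica held by each Eureka client. */
class LocalRegistry {
    private final Map<String, List<String>> servers = new ConcurrentHashMap<>();

    /** Would be called by a background refresh that copies records from the server. */
    void replace(String vipAddress, List<String> hostPorts) {
        servers.put(vipAddress, Collections.unmodifiableList(new ArrayList<>(hostPorts)));
    }

    /** Reads only the local copy; no call to the Eureka server is made here. */
    List<String> lookup(String vipAddress) {
        return servers.getOrDefault(vipAddress, Collections.<String>emptyList());
    }
}

/** Hypothetical picker: walks the local server list in round-robin order. */
class RoundRobinPicker {
    private final LocalRegistry registry;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinPicker(LocalRegistry registry) {
        this.registry = registry;
    }

    String pick(String vipAddress) {
        List<String> candidates = registry.lookup(vipAddress);
        if (candidates.isEmpty()) {
            throw new IllegalStateException("No server known for " + vipAddress);
        }
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), candidates.size());
        return candidates.get(index);
    }
}
```

Because the lookup never leaves the process, picking a server keeps working while the registry server is unreachable; only the freshness of the list suffers.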
[Diagram: Eureka's records (10.0.12.16:1234, 10.0.140.21:4321, 10.0.140.26:6789) are replicated to each client; Ribbon (R) picks 10.0.140.21:4321 from the local copy]
● Single point of failure? Not really
○ Route tables are replicated
○ Each client has a copy
○ Routes are queried from the local copy
● When Eureka is down
○ New servers are not picked up
○ A client might call a dead server -> Ribbon retries against the local server list
[Diagram: Auth1, Auth2, SIP, and HTTP modules each hold a local copy of the routes from EurekaServer]
String moduleVipAddress = "call.hoiio.info";
// The second argument is a fresh UUID per request (the correlation ID)
Observable<HttpResponse> response = HoiioRibbonRequest.getInstance().makeRequest(
        moduleVipAddress,
        UUID.randomUUID().toString(),
        httpRequest);
● Timeout and Retry
○ Defined in HoiioRibbonRequest
○ Default:
■ Timeout: 10s
■ Retry:
● Same server: 0
● Next server: 3
○ Can be re-configured
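To make the "same server" and "next server" counters concrete, here is a minimal sketch of how such a retry loop could behave. It is not the code inside HoiioRibbonRequest or Ribbon: `RetryingCaller` is a hypothetical class, it walks the server list in order instead of asking a load balancer, and it leaves the timeout out entirely.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical retry loop illustrating same-server and next-server retries. */
class RetryingCaller {
    private final int sameServerRetries;
    private final int nextServerRetries;

    RetryingCaller(int sameServerRetries, int nextServerRetries) {
        this.sameServerRetries = sameServerRetries;
        this.nextServerRetries = nextServerRetries;
    }

    /**
     * Each server gets 1 + sameServerRetries attempts, and at most
     * 1 + nextServerRetries servers are tried before giving up.
     */
    <T> T call(List<String> servers, Function<String, T> request) {
        RuntimeException last = null;
        int serversToTry = Math.min(servers.size(), 1 + nextServerRetries);
        for (int s = 0; s < serversToTry; s++) {
            for (int attempt = 0; attempt <= sameServerRetries; attempt++) {
                try {
                    return request.apply(servers.get(s));
                } catch (RuntimeException e) {
                    last = e; // remember the failure and keep going
                }
            }
        }
        throw new IllegalStateException("All attempts failed", last);
    }
}
```

With the defaults from the slide (same server: 0, next server: 3), a request that hits a dead server is not repeated there; it moves straight on and may try up to three other servers.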
HttpClient httpClient;
RetryPolicy retryPolicy;
String moduleVipAddress = "call.hoiio.info";
// Assumed: a random UUID, as in the previous example
String correlationId = UUID.randomUUID().toString();

// Override the default retry counts
Integer sameServerRetry = 1;
Integer nextServerRetry = 1;
retryPolicy = new RetryPolicy(
        new DefaultLoadBalancerRetryHandler(
                sameServerRetry,
                nextServerRetry,
                true));  // true enables retries

// Override the default timeout: 60 seconds, passed in milliseconds
Integer timeout = 60;
httpClient = new HttpClient(500, 50, timeout * 1000);

Configuration config = new Configuration(moduleVipAddress, httpClient, retryPolicy);
Observable<HttpResponse> httpResponse = HoiioRibbonRequest.getInstance().makeRequest(
        config,
        correlationId,
        httpRequest);
Service Discovery: Consul
● Clustering with an agent on each instance
● Service info is shared across the cluster
● Each agent has a REST interface to register/deregister/check/query/…
● Zuul-fronted as the primary reverse proxy
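As an illustration of the agent's REST interface, the sketch below builds (but does not send) register and deregister requests with Java 11's `java.net.http`. It assumes the agent listens on its default local address `127.0.0.1:8500`; `ConsulAgentRequests` is a hypothetical helper, and the JSON is assembled by hand without escaping, which is only acceptable for simple values like the ones shown.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Locale;

/** Hypothetical helper that builds requests for the local Consul agent's HTTP API. */
class ConsulAgentRequests {
    // Assumption: the agent runs on the same instance with the default HTTP port.
    private static final String AGENT = "http://127.0.0.1:8500";

    /** Minimal registration payload; values are not JSON-escaped in this sketch. */
    static String registrationJson(String name, String tag, String address, int port) {
        return String.format(Locale.ROOT,
                "{\"Name\":\"%s\",\"Tags\":[\"%s\"],\"Address\":\"%s\",\"Port\":%d}",
                name, tag, address, port);
    }

    /** PUT /v1/agent/service/register with the service definition as the body. */
    static HttpRequest register(String name, String tag, String address, int port) {
        return HttpRequest.newBuilder(URI.create(AGENT + "/v1/agent/service/register"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(
                        registrationJson(name, tag, address, port)))
                .build();
    }

    /** PUT /v1/agent/service/deregister/{serviceId} with an empty body. */
    static HttpRequest deregister(String serviceId) {
        return HttpRequest.newBuilder(
                        URI.create(AGENT + "/v1/agent/service/deregister/" + serviceId))
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
    }
}
```

Sending is then a matter of passing the request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` on a machine where an agent is actually running.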
Implementation
[Diagram labels: service.json (twice), Zuul]
HoiioConsulLoadBalancer lb = new HoiioConsulLoadBalancer(appName, ConsulService.Info.environment(), tag);
HttpResponse httpResponse;
try {
    httpResponse = lb.execute(new HttpCmd(httpRequest));
} catch (NoServerException ignored) {
    // No registered instance for this app: answer with HTTP 503
    ZuulLogger.logger.error("No server for " + appName);
    httpResponse = responseFactory.get().newHttpResponse(
            new BasicStatusLine(HttpVersion.HTTP_1_1, 503, "Service not available"),
            null);
}
API Gateway with Netflix Zuul and Archaius
Problems
● Single gateway for the API
● API mapping for easy understanding
● Optimize the number of requests made
● Reject malformed requests
[Diagram: sms, auth, billing, and sip modules, reached over HTTP]
● Why Zuul?
○ Apps do not have a Eureka client
○ Cron jobs
○ Exposing the API
● What Zuul does
○ Represents the API caller (apps, cron jobs, partners, ...) when talking to modules (acts as a proxy)
■ Relays requests
■ Retries
○ Authenticates requests
[Diagram: an API caller sends /a/b/c to Zuul (Z); Zuul maps /a/b/c -> /a/c, picks a server from its copy of the Eureka records (10.0.12.16:1234, 10.0.140.21:4321, 10.0.140.26:6789), and forwards /a/c to the microservice]
Netflix Zuul
● Pre, route, and post filters
○ Written as Groovy filters
○ Each filter has a priority
● Integrates with Archaius for dynamic configuration
● Integrates with Eureka/Consul for service discovery
[Diagram: pre filters reject malformed requests and authenticate; route filters route using Ribbon/Eureka; post filters add headers; configuration comes from Archaius]
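The pre/route/post ordering with per-filter priority can be sketched as follows. This is a simplified model, not Zuul's own filter classes: `GatewayFilter`, `SimpleFilter`, and `FilterChain` are hypothetical names, the request context is reduced to a string map, and "lower order runs first" is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical filter contract: a phase, a priority, and an action. */
interface GatewayFilter {
    String type();   // "pre", "route" or "post"
    int order();     // lower values run first within a phase (assumption)
    void run(Map<String, String> context);
}

/** Small convenience implementation backed by a lambda. */
class SimpleFilter implements GatewayFilter {
    private final String type;
    private final int order;
    private final Consumer<Map<String, String>> action;

    SimpleFilter(String type, int order, Consumer<Map<String, String>> action) {
        this.type = type;
        this.order = order;
        this.action = action;
    }

    public String type() { return type; }
    public int order() { return order; }
    public void run(Map<String, String> context) { action.accept(context); }
}

/** Runs all filters phase by phase, sorted by priority inside each phase. */
class FilterChain {
    private static final List<String> PHASES = Arrays.asList("pre", "route", "post");
    private final List<GatewayFilter> filters = new ArrayList<>();

    void add(GatewayFilter filter) {
        filters.add(filter);
    }

    void process(Map<String, String> context) {
        for (String phase : PHASES) {
            filters.stream()
                    .filter(f -> f.type().equals(phase))
                    .sorted(Comparator.comparingInt(GatewayFilter::order))
                    .forEach(f -> f.run(context));
        }
    }
}
```

Registration order does not matter here: a post filter added first still runs after every pre and route filter.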
Route mapping
/sms/send -> {"module": "sms", "uri": "sendOneSms"} -> /sendOneSms
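The route mapping above amounts to a lookup from a public path to a module and an internal URI. A minimal sketch, assuming an in-memory table (`RouteTable` and `Target` are hypothetical names; in the talk's setup the mapping would come from dynamic configuration):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical in-memory route table: public path -> (module, internal URI). */
class RouteTable {
    static final class Target {
        final String module;
        final String uri;

        Target(String module, String uri) {
            this.module = module;
            this.uri = uri;
        }

        /** Path to call on the target module, e.g. "/sendOneSms". */
        String path() {
            return "/" + uri;
        }
    }

    private final Map<String, Target> routes = new HashMap<>();

    void map(String publicPath, String module, String uri) {
        routes.put(publicPath, new Target(module, uri));
    }

    /** Empty when the public path is not mapped, so the caller can reject the request. */
    Optional<Target> resolve(String publicPath) {
        return Optional.ofNullable(routes.get(publicPath));
    }
}
```

For example, after `table.map("/sms/send", "sms", "sendOneSms")`, resolving `/sms/send` yields module `sms` and path `/sendOneSms`, while an unmapped path yields an empty result.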
● Timeout and Retry
○ Zuul represents API callers when talking to modules -> Zuul must be told the timeout and retry settings for each API
○ Default values
■ Timeout: 10s
■ Retry:
● Same server: 0
● Next server: 3
{
  "vipAddress": "auth.hoiio.info",
  "module": "auth",
  "apis": [
    {
      "from": "/v1/otp",
      "to": "/private/v1/otp",
      "type": "private",
      "timeout": 60,
      "retry": {
        "same": 1,
        "next": 2
      }
    }
  ]
}
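One way to read the configuration above is "per-API values override the gateway defaults". The sketch below shows that fallback logic only; `ApiSettings` is a hypothetical class, JSON parsing is left out, and the timeout is assumed to be in seconds, as the earlier `timeout * 1000` conversion suggests.

```java
/** Hypothetical holder for per-API timeout and retry settings with defaults. */
class ApiSettings {
    // Defaults taken from the slide: timeout 10s, same server 0, next server 3.
    static final int DEFAULT_TIMEOUT_SECONDS = 10;
    static final int DEFAULT_SAME_SERVER_RETRIES = 0;
    static final int DEFAULT_NEXT_SERVER_RETRIES = 3;

    final int timeoutSeconds;
    final int sameServerRetries;
    final int nextServerRetries;

    private ApiSettings(int timeoutSeconds, int sameServerRetries, int nextServerRetries) {
        this.timeoutSeconds = timeoutSeconds;
        this.sameServerRetries = sameServerRetries;
        this.nextServerRetries = nextServerRetries;
    }

    /** A null argument means "not configured for this API"; the default applies. */
    static ApiSettings of(Integer timeout, Integer same, Integer next) {
        return new ApiSettings(
                timeout != null ? timeout : DEFAULT_TIMEOUT_SECONDS,
                same != null ? same : DEFAULT_SAME_SERVER_RETRIES,
                next != null ? next : DEFAULT_NEXT_SERVER_RETRIES);
    }

    /** Timeout converted to milliseconds for an HTTP client. */
    int timeoutMillis() {
        return timeoutSeconds * 1000;
    }
}
```

For the `/v1/otp` entry this would be `ApiSettings.of(60, 1, 2)`, while an API with no overrides would use `ApiSettings.of(null, null, null)`.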
Monitoring
Service status
● Remember Consul?
● Consul watch
○ Triggers an action when a service's status changes
service.json
{
  "service": {
    "name": "MS-Apps-1-46",
    "tags": ["prod"],
    "address": "10.0.14.10",
    "port": 8080,
    "checks": [
      {
        "script": "/opt/consul/bin/MS-Apps-1-46-healthcheck.sh",
        "interval": "60s"
      }
    ]
  }
}
Instance stats
● Metric sources:
○ CollectD/cAdvisor
○ CloudWatch
● Metric storage:
○ InfluxDB
● Visualization:
○ Grafana
[Diagram labels: Kapacitor, CloudWatch, CAS, slack/sms]
Thank you!
We are hiring!
● Fresh web engineer
● Senior web engineer
● Internship