Windows Azure Platform Technical Deep Dive - Chris Auld (Intergen)
Windows Azure Platform Technical Deep Dive
Chris J.T. Auld
Director
Intergen Ltd
[email protected]/cauld
Notes Before We Begin
I don’t work for Microsoft
• Will call a ‘spade a spade’ today
Azure is a young technology
• Best practices are still emerging
We focus on architecture in this session
• A few demos only
No prior experience assumed
We have lots of attendees and just a little time
• Questions at the end please
TicketDirect: An example application
TicketDirect is a ticketing company in Australia and New Zealand
Ticketing is uniquely suited to the cloud
Will use TicketDirect as real world example today
TicketDirect Architecture
SQL Azure
• Castellan (application DB)
• Castellan Venue DBs: Venue 1 partition(s), Venue 2 partition(s), ... Venue N partition(s)
• One application DB, many venue DBs, each partitioned into many parts (40+)
Azure Roles
• http:// TicketDirect .*
• Dynamic Worker (tasks uploaded as blobs)
• Partitioner Worker
• Distributed Cache Worker
Azure Storage
• Queues for communication between clients and roles
• Tables to record server & partition information
• Blobs to store web and worker role resources
Client Applications
• Castellan.old (VB6)
• Castellan.Azure: Box Office sales, Ticket Printing, System Administration, Venue/Event Management, Partitioning
.Net Service Bus
WCF
On Premise SQL Server
• Castellan Venue
A Global Hardware Platform
Global Foundation Services: http://www.globalfoundationservices.com
6 Azure Data Centers
• Europe: West/North
• Asia: East/Southeast
• USA: South Central/North Central
New containerised data centers
Approaching PUE of 1.2
Thousands of computation units
The High Scale Application Archetype
Network Activation
• Intelligent Network Load Balancer
• Stateless Web and/or Application Servers
Async Activation
• Queues
• Stateless ‘Worker’ Machines
State Tier
• Shared Filesystem
• Partitioned RDBMS
• ‘NoSQL’ Datastores
Windows Azure provides a ‘pay-as-you-go’ scale out application platform
Azure Service Architecture
[Diagram: Windows Azure Data Center]
The Internet: access via TCP or HTTP, through load balancers (LB)
Web Role: Web Site (ASPX, ASMX, WCF), IIS as host
Worker Role: Worker Service, managed interface call
Storage: Tables, Blobs, Queues
Service Deployment
[Diagram components: Your Service, Service Model, config, Web Portal (API), Fabric Controller, Service instances, load balancers (LB), DNS]
Service Update
[Diagram components: Your Service, Service Model, config, Web Portal (API), Fabric Controller, Service instances in production and staging, load balancers (LB), DNS]
Upgrading Your Application
Two Models: VIP Swap and In-Place Upgrade
VIP Swap
• Uses Staging and Production environments
• Allows the environments to be swapped quickly
• Before the swap: Production v1, Staging v2. After the swap: Production v2, Staging v1
In-Place Upgrade
• Performs a rolling upgrade on the live service
• Entire service or a single role
• Manual or automatic across update domains
• Cannot change the Service Model
Configuration
Service Configuration
• ServiceDefinition.csdef: Service Model
• ServiceConfiguration.cscfg: instance data
RoleEnvironment.GetConfigurationSettingValue()
Don’t use web.config for values you wish to change at runtime
• A web.config change requires a re-deploy
Service Scaling
[Diagram components: Your Service, Service Model, Web Portal (API), Fabric Controller, a growing number of Service instances, load balancers (LB), DNS]
Rule Based Auto-Scaling
Use Service Management API
Predictable or periodic demand
• Time based rules
Unpredictable demand
• Monitor metrics and react accordingly
Monitor metrics
Primary metrics (actual work done)
• Requests per second
• Queue messages processed / interval
Secondary metrics
• CPU utilization
• Queue length
• Response time
Derivative metrics
• Rate of change of queue length
Use ‘historical’ data to help predict requirements
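A derivative metric such as the rate of change of queue length can be computed from a handful of samples. The sketch below is only an illustration: it assumes samples are taken at a fixed interval, and the function names and the linear extrapolation are hypothetical, not part of any Azure API.

```python
def queue_growth_rate(samples, interval_seconds):
    """Average change in queue length per second across the samples."""
    if len(samples) < 2:
        return 0.0
    elapsed = interval_seconds * (len(samples) - 1)
    return (samples[-1] - samples[0]) / elapsed


def predicted_queue_length(samples, interval_seconds, horizon_seconds):
    """Linear extrapolation of the queue length, never below zero."""
    rate = queue_growth_rate(samples, interval_seconds)
    return max(0.0, samples[-1] + rate * horizon_seconds)
```

A queue that grows from 100 to 220 messages over two one-minute samples is growing at one message per second, which is a stronger signal to scale out than the raw length of 220.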
Evaluating Business Rules
Are requests taking too long?
Do I have too many jobs in my queue?
How much money have I spent this month?
Could write these into code.
Could build some sort of rules engine.
Could use WF rules engine.
Take Action
Add/Remove Instances
• Use Service Management API
• Don’t forget billing window is 1hr
Change role size
• Requires change to *.csdef
• Most suited to Worker Roles
Send notifications
• Email
• IM
Manage momentum
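The rule evaluation and the "take action" step could be written straight into code, as the slides suggest. The following is a minimal sketch under stated assumptions: the thresholds, the `Metrics` type and the `desired_instances` function are all hypothetical, and the actual call to the Service Management API is left out.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    queue_length: int            # secondary metric
    avg_response_seconds: float  # secondary metric


def desired_instances(current, metrics, minutes_since_last_change,
                      min_instances=2, max_instances=20):
    """Return the instance count the (hypothetical) rules ask for."""
    target = current
    if metrics.avg_response_seconds > 2.0 or metrics.queue_length > 1000:
        target = current + 1
    elif metrics.avg_response_seconds < 0.5 and metrics.queue_length < 100:
        # The billing window is one hour, so removing an instance earlier
        # saves nothing. Waiting also damps oscillation ("momentum").
        if minutes_since_last_change >= 60:
            target = current - 1
    return max(min_instances, min(max_instances, target))
```

Scaling out reacts immediately while scaling in waits for the billing window, which is one simple way to manage momentum.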
Service Monitoring & Recovery
[Diagram components: Your Service, Service Model, Web Portal (API), Fabric Controller, Service instances, load balancers (LB), DNS]
Monitoring
No Debugging in Cloud
Instrument your application using Trace, Debug
Use Diagnostics API to configure and collect
• Event Logs
• Performance Counters
• Trace/Debug information (logging)
• IIS Logs, Failed Request Logs
Request data on demand or scheduled
• Transferred into your table and/or blob storage
Everything is remotely configurable
Storage
Scalable storage in Azure Datacenter
• 100 TB per storage account
Accessible via RESTful Web Service API
• Access from Azure Compute
• Access from anywhere via internet
• Supporting .NET Client Library
Various storage types
• Table
• Queue
• Blob
• Drives
Storage
Tables
• Table = group of entities
• Entity = name/value pairs
• Partitioned by key
• Scale out to billions of entities
• Not an RDBMS
Blobs
• Large binary storage
• Stored in container
• Unlimited containers
• CDN Deliverable
Queues
• Simple message queue
• Not transactional
• Read at least once
• Delete to remove message, otherwise it is returned to the queue
• Partitioned by Queue Name
Using Queues for Async Processing
[Diagram: Web Roles and Worker Roles behind load balancers (LB), a Storage Queue, a Blob Container and a Table; a 30 MB JPEG moves through the system]
1. User uploads large image file
2. Image inserted into blob storage
3. Message placed on queue, including blob URI and metadata
4. Worker role is polling the queue and reads the message
5. Worker role processes message, reads from blob storage, generates thumbnail
6. Thumbnail and metadata stored in Table storage
7. Message deleted from queue
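The read-then-delete behaviour behind this flow can be imitated with a small in-memory stand-in. This is only a sketch: `SketchQueue` is a hypothetical class, not the Azure storage client, the blob URI is made up, and the clock is passed in explicitly so the example stays deterministic.

```python
import itertools


class SketchQueue:
    """In-memory stand-in for a storage queue with a visibility timeout."""

    def __init__(self, visibility_timeout=180):
        self._messages = {}                  # id -> (body, visible_at)
        self._ids = itertools.count(1)
        self.visibility_timeout = visibility_timeout

    def put(self, body, now=0):
        self._messages[next(self._ids)] = (body, now)

    def get(self, now):
        """Return (id, body) of a visible message and hide it, or None."""
        for msg_id, (body, visible_at) in self._messages.items():
            if visible_at <= now:
                self._messages[msg_id] = (body, now + self.visibility_timeout)
                return msg_id, body
        return None

    def delete(self, msg_id):
        self._messages.pop(msg_id, None)


queue = SketchQueue(visibility_timeout=180)
queue.put({"blob_uri": "images/photo.jpg"})  # hypothetical blob URI
first = queue.get(now=0)      # a worker reads the message
hidden = queue.get(now=60)    # None: the message is invisible to others
again = queue.get(now=180)    # never deleted, so it becomes visible again
queue.delete(again[0])        # processing finished, remove it for good
```

Reading a message only hides it. Deleting it is a separate, explicit step, so a worker that dies between the two leaves the message to be picked up again.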
First Step For Software Architects
If you expect to write documents mentioning idempotency
1. Open Word
2. Type idempotency
3. Right click
4. Choose add to dictionary
WARNING: Failure to follow these steps will surely see you sending an important architecture and design document to a client with the ‘corrected’ spelling of the word... impotency
Messages Process At Least Once
[Diagram: Web Roles and Worker Roles behind load balancers (LB), with a Storage Queue]
1. Debit bank account $100 message
2. Worker role reads message
3. Balance debited $100
4. Worker role is torn down before message can be deleted
5. 3 minutes later, message re-appears on queue
6. Worker role reads message
7. Balance debited $100
8. Message deleted from queue
9. Chaos ensues.....
10. Customer calls bank.....
Balance = $1000, then $900, then $800
Solving The Idempotency Problem
[Diagram: Web Roles and Worker Roles behind load balancers (LB), a Storage Queue, a Table that the worker queries, and a second Queue]
1. Debit bank account $100 message with transaction ID
2. Worker role reads message. Checks transaction ID not present.
3. Writes transaction ID with state ‘Started’ to ‘Replay Log’
4. Balance debited $100
5. Worker role is torn down before message can be deleted
6. 3 minutes later, message re-appears on queue
7. Worker role reads message. Checks transaction ID. It is present in state ‘Started’.
8. Compensating message written to another queue
9. Message deleted from queue
10. Compensating message processed.
Balance = $1000, then $900
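The replay-log steps can be sketched with plain dictionaries standing in for the table and the queues. All names here are hypothetical, and the ‘Completed’ state is an assumption added for the sketch; the slide only shows ‘Started’.

```python
def process_debit(message, balances, replay_log, compensation_queue):
    """Handle one debit message and report what was done.

    Whatever the outcome, the caller should then delete the message.
    """
    tx_id = message["transaction_id"]
    state = replay_log.get(tx_id)
    if state == "Completed":
        return "duplicate"            # already applied in full
    if state == "Started":
        # An earlier attempt began but never finished, so the debit may or
        # may not have happened. Hand it to the compensation queue.
        compensation_queue.append(message)
        return "compensate"
    replay_log[tx_id] = "Started"
    balances[message["account"]] -= message["amount"]
    replay_log[tx_id] = "Completed"   # assumption: not shown on the slide
    return "applied"


balances = {"acct-1": 1000}
replay_log, compensation_queue = {}, []
message = {"transaction_id": "tx-1", "account": "acct-1", "amount": 100}

# Reproduce the failure: the debit happened, then the worker was torn
# down before the message could be deleted.
replay_log["tx-1"] = "Started"
balances["acct-1"] -= message["amount"]

# Three minutes later the message reappears and is read again.
outcome = process_debit(message, balances, replay_log, compensation_queue)
```

The second read leaves the balance at $900 and routes the message to the compensation queue instead of debiting again. In a real system the log write and the debit are not atomic, which is why the uncertain case goes to compensation and is not silently dropped.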
Azure Table Storage – Key Points
Partition Key is the killer feature
• Partitions are auto-balanced
• No need to partition into equal bins
Hot partitions may be scaled up
• Azure fabric may dedicate more resources to partitions with high Tx load
Partition Key AND Row Key = Primary Key
• Must include PartitionKey for Create, Update, Delete
• Select queries across partitions are parallelized, resource intensive and potentially more expensive!
Azure Table Storage – Key Points
Continuation tokens may be returned from cross partition queries
• Any query not including the PartitionKey needs to handle continuation tokens: http://tinyurl.com/ContToken
Key columns up to 1KB in size
• Should aim to keep to 260 char URI limit
Be aggressive
• e.g. Only ever query by an ID? RowKey = PartitionKey
All queries should include partition key
Azure Tables != RDBMS
Storage is cheap
Cross partition queries are resource intensive
De-normalization and massive duplication often name of the game
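Handling continuation tokens amounts to a loop that keeps asking for the next page until no token comes back. The sketch below assumes a hypothetical `query_page(token)` callable that returns a list of entities and the next token; it is not the storage client library's API.

```python
def fetch_all(query_page):
    """Collect every entity from a paged query.

    query_page(token) is a hypothetical callable returning
    (entities, next_token); next_token is None on the last page.
    """
    entities, token = [], None
    while True:
        page, token = query_page(token)
        entities.extend(page)
        if token is None:
            return entities
```

The loop ends on a missing token and not on an empty page, because a page can come back empty and still carry a token.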
E.g. Tweet Storage
Normalized, relational layout
• Tweet: …, Message
• TweetWord: TweetID, WordID
• Word: WordID, Word (IX)
E.g. Tweet Storage
Denormalized, indexed by word
• Tweet: TweetID (RK), UserID (PK), DateTimeStamp, Message
• TweetIndex: TweetID (RK), Word (PK), UserID, DateTimeStamp, Message
E.g. Tweet Storage
Denormalized, indexed by mention
• Tweet: TweetID (RK), UserID (PK), DateTimeStamp, Message
• MentionIndex: TweetID (RK), UserID (PK), UserID, DateTimeStamp, Message
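Writing the denormalized form means producing one Tweet entity plus one TweetIndex entity per distinct word, each carrying a full copy of the message. This is a minimal sketch that builds the entities as dictionaries; the word-splitting rule and the function name are assumptions, and no storage call is made.

```python
import re


def index_entities(tweet):
    """Build the Tweet entity plus one TweetIndex entity per distinct word."""
    base = {
        "TweetID": tweet["TweetID"],
        "UserID": tweet["UserID"],
        "DateTimeStamp": tweet["DateTimeStamp"],
        "Message": tweet["Message"],
    }
    # Tweet table: partitioned by user, one row per tweet.
    tweet_entity = dict(base, PartitionKey=tweet["UserID"], RowKey=tweet["TweetID"])
    # TweetIndex table: partitioned by word, message duplicated in every row.
    words = sorted(set(re.findall(r"[a-z0-9']+", tweet["Message"].lower())))
    index = [dict(base, PartitionKey=word, RowKey=tweet["TweetID"]) for word in words]
    return tweet_entity, index
```

A search for a word then becomes a single-partition query on TweetIndex with no join, at the cost of storing the message once per word.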
Pricing
Compute
• $0.12 / CPU hour (or part thereof)
• ~ 1.7 GHz, 2 GB RAM, single core
• $2.88 / day
• $86.40 / 30 days (billing period)
• 2 instances = $172.80 / month
Storage
• $0.15 / GB / month
• $0.01 / 10,000 calls to storage web service
Bandwidth
• $0.30 / GB inbound to Asian datacenters
• $0.45 / GB outbound from Asian datacenters
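The compute figures follow directly from the hourly rate. A small sketch of the arithmetic, using only the $0.12 rate quoted on the slide; the function name is illustrative.

```python
from decimal import Decimal

HOURLY_RATE = Decimal("0.12")  # per CPU hour, as quoted on the slide


def compute_cost(instances, days=30):
    """Cost of running `instances` small instances around the clock."""
    return HOURLY_RATE * 24 * days * instances


print(compute_cost(1, days=1))  # 2.88
print(compute_cost(1))          # 86.40
print(compute_cost(2))          # 172.80
```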
Design Considerations
Scale and availability are the design points
Storage isn’t a relational database
Stateless
• Stateless front ends, store state in storage
Use queues to decouple components
Instrument your application (Trace)
Once you are on - stay on
Think about patching & updates
SQL Azure
Initial Services
• Database: core SQL Server database capabilities in cloud optimized topology
• Highly compatible with on premise SQL Server
Future Services
• Data Sync: enables the sync framework
• Additional SQL Server capabilities available as a service: Business Intelligence and Reporting
SQL Azure Details
SQL Azure provides logical SQL Server
• Gateway server that understands TDS protocol
• Looks like SQL Server to TDS Client
• Actual data stored on multiple backend data nodes
Logical optimisations supported
• Indexes, query plans etc.
Physical optimisations not supported
• File groups, partitions etc.
SQL Azure transparently manages physical storage
Shared Environment
[Diagram: four hardware boundaries; databases A, B, C and D each appear three times, spread across the boundaries]
Design Considerations
1 x 10 GB database: 1 instance
10 x 1 GB databases: 10 instances
Partition for
• Data volume
• Query load
SQL Azure – Key Points
Partition for data volume > 10 GB
Transaction throttle (non deterministic)
• Always code for retry
All partition logic up to the developer
• Algorithmic
• Lookup based
Partitions are not auto-balanced
• Need to aim for ‘equal’ partitions
• ‘Equal’ not necessarily the same size
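"Always code for retry" usually means wrapping each database operation in a bounded retry loop with a growing delay. The sketch below is generic: `TransientError` is a hypothetical stand-in for whatever throttling or dropped-connection error the data access layer raises, and the delay values are arbitrary.

```python
import random
import time


class TransientError(Exception):
    """Stand-in for a throttling or dropped-connection error."""


def with_retry(operation, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Run operation(), retrying transient failures with backoff and jitter."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts, let the caller see the failure
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

Because a retried operation may already have taken effect, this pairs naturally with the idempotency concerns raised earlier for queue messages.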
Choosing a Partition Key
Natural keys
• Country
• First letter, last name
• Date
Mathematical
• Hash functions
• Modulo operator
Lookup based
• Lookup table to resolve value to partitions
Using Modulo
The remainder of a division
Nice properties for partitioning:
• Given two positive integers M and N
• M mod N will return a number between 0 and N-1
Want equi-sized partitions?
• Given an appropriate distribution of M we will get N ‘equally full’ buckets.
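A short sketch of modulo partitioning, assuming sequential integer keys such as order numbers; `partition_for` is an illustrative name.

```python
from collections import Counter


def partition_for(key, partition_count):
    """Map a non-negative integer key to a partition number in 0..N-1."""
    return key % partition_count


# 1,000 sequential IDs spread over 4 partitions.
counts = Counter(partition_for(order_id, 4) for order_id in range(1, 1001))
```

With evenly distributed keys like these, each of the four partitions receives 250 keys. Skewed keys, for example IDs that are all multiples of 4, would all land in one partition.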
Using Hash Values
Using a hash function projects one distribution into another
Use a hash function that projects a random distribution
Do NOT use a cryptographic hash function
Plenty of choice on the web: http://tinyurl.com/part-hash
Be careful if using Object.GetHashCode()
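One common non-cryptographic choice is 32-bit FNV-1a, shown here as an example only; the slides do not name a specific function. It gives the same value in every process and on every machine, which a partitioning hash needs. Python's built-in `hash()` for strings is randomized per process by default, so it calls for the same caution as Object.GetHashCode().

```python
def fnv1a_32(text):
    """32-bit FNV-1a hash of a string's UTF-8 bytes."""
    value = 0x811C9DC5                              # FNV offset basis
    for byte in text.encode("utf-8"):
        value ^= byte
        value = (value * 0x01000193) & 0xFFFFFFFF   # FNV prime, keep 32 bits
    return value


def partition_for(key, partition_count):
    """Map a string key to a partition number in 0..N-1."""
    return fnv1a_32(key) % partition_count
```

Combining the hash with modulo turns arbitrary string keys, such as customer names, into partition numbers.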
Just in time Partitioning
In SQL Azure partitions cost money
In highly elastic scenarios partitions may be needed for just a few hours or days
If load is predictable
• Partition before load commences
• De-partition after load has subsided
Pricing
Web Edition: 1 GB database, $9.99 / month
Business Edition: 10 GB database, $99.99 / month
Bandwidth (both editions): $0.10 / GB inbound, $0.15 / GB outbound
• Pro rated by the day or part thereof
• Can move up and down between sizes
• SQL Azure has no query charge
• Excessively long transactions or high query load may result in throttling
• 50 GB database size in Beta
Windows Azure Platform AppFabric
Service Bus: general purpose application messaging bus
Access Control: rules-driven, claims-based access control
Extending .NET to the cloud with Internet Scale Utility Services
Simplified, Secure Connectivity for the Cloud
Service Bus and Access Control in Windows Azure platform AppFabric are powerful building blocks.
AppFabric SERVICE BUS: connect apps & services
AppFabric ACCESS CONTROL: control & secure access
Secure Connectivity
• Bridge cloud services, on-premises apps, and hosted assets
• Build distributed apps for your business or to collaborate with partners
Across boundaries
• Navigate network and security boundaries, securely and simply
• Federate identity and access across organizations and ID providers
• Simplify claims-based authorization for distributed apps and web services
At Cloud Scale
• Scale up and down as your business requires
• Automated service mgmt. and dynamic scale
• Interoperate with a variety of languages and industry standards
AppFabric Service Bus Connectivity
[Diagram: Application #1 and Application #2, separated by a firewall, each sending and receiving through the Service Bus]
Exchange messages between loosely coupled, composite applications.
Direct connection facilitated by Service Bus if that is the best connection mechanism.
Payloads: Text, XML, Graphics, Binary Data, Streaming
Architecture of AppFabric Access Control
[Diagram: User (Application), Your Access Control Project, Your App (Relying Party)]
0. Trust exchanged; secrets, certs
1. Define access control rules
2. Send token (initial claims; e.g. identity)
3. Map input claims to output claims based on access control rules
4. Return token (output claims from 3)
5. Send token with request
6. Check for claims
Pricing & SLA
$1.99 / 100k ACS transactions
Connections
• $3.99 / connection / month
• Packages available
Bandwidth
• $0.10 / GB inbound
• $0.15 / GB outbound
Windows Azure Platform Benefits
The Cloud
New Economic Model
• Low Capex
• Pay as you Go
Elastic Scale
• Only solvable via Cloud
Global Distribution
• Global data centers
Windows Azure
High Level of Abstraction
• Hardware
• Server OS
• Network Infrastructure
• Web Server
Availability
• Automated Service Management
• Azure CDN
Scalability
• Instance & Partitions
Developer Experience
• Familiar Developer Tools
Windows Azure Platform Benefits
AppFabric
High Performance Messaging
• Massively scalable
• HTTP and Raw TCP
Access Control
• Less brittle apps due to factoring out rules
Developer Experience
• Familiar Developer Tools
• WCF bindings
SQL Azure
Higher Level of Abstraction
• Hardware
• Server OS
• Network Infrastructure
• Database Server
Availability
• Automated Database Management & Replication
Scalability
• Database Partitioning
Developer Experience
• Familiar SQL Environment