0 to 60 in 3_1 Presentation

8/8/2019 0 to 60 in 3_1 Presentation

1/21

Presented by,

MySQL AB & O'Reilly Media, Inc.

0 to 60 in 3.1

Tyler Carlton

Cory Sessions


2/21



3/21

The Project

Medium-sized demographics data mining project

1,700,000+ User base

Hundreds of data points per user


4/21

Legacy System: Why Upgrade?

Main DB (External Users)

Offline backup (Internal Users)

Weekly manual copy backups

Max of 3 simultaneous data pulls

8hr+ data pull times for complex data pulls

Random index corruption


5/21

Notes:

Smaller is Better

On average, CPU usage with MySQL was 20% lower than our old database solution.


6/21

Why We Chose MySQL Cluster

Scalable

Distributed processing

Five nines (99.999%) reliability

Instant data availability between internal & external users


7/21

What We Built: NDB Data Nodes

8-node NDB cluster

Dual Core 2 Quad 1.8 GHz

16 GB RAM (data memory)

6x RAID 10 SAS 15k RPM drives


8/21

What We Built: API & MGMT Nodes

3 API nodes + 1 management node

Dual Core 2 Quad 1.8 GHz

8 GB RAM

300 GB 7200 RPM (RAID 0)


9/21

NDB Issues with a Large Data Set

NDB load times

Loading from backup: ~ 1 hour

Restarting NDB nodes: ~ 1 hour

Note: Load times differ depending on your data size


10/21

NDB Issues with a Large Data Set

Indexing Issues

Force index (NDB picks wrong)

Index creation/modification order matters (Seriously!)

Local Checkpoint Tuning

TimeBetweenLocalCheckpoints: 20 means 4 MB (4 × 2^20 bytes) of write operations

NoOfFragmentLogFiles: number of sets of 4 × 16 MB redo log files

None deleted until 3 local checkpoints have completed

On startup: Local checkpoint buffers would overrun

RTFM (two, maybe three times)
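The checkpoint arithmetic above can be sanity-checked with a short sketch. It assumes, as the slide states, that TimeBetweenLocalCheckpoints is a base-2 exponent applied to 4-byte words and that each NoOfFragmentLogFiles unit is a set of 4 files of 16 MB. The function names and the example value of 8 log file sets are illustrative and do not come from the presentation.

```python
MB = 1024 * 1024

def lcp_write_threshold_bytes(time_between_lcp: int) -> int:
    """Bytes of write operations between local checkpoints: 4 * 2^value."""
    return 4 * 2 ** time_between_lcp

def redo_log_bytes(no_of_fragment_log_files: int, file_size_mb: int = 16) -> int:
    """Total redo log space: each unit is a set of 4 files of file_size_mb."""
    return no_of_fragment_log_files * 4 * file_size_mb * MB

# The slide's example setting of 20.
print(lcp_write_threshold_bytes(20) // MB, "MB of writes per checkpoint interval")
# Hypothetical setting of 8 log file sets, chosen only for illustration.
print(redo_log_bytes(8) // MB, "MB of redo log")
```

Because redo log files are not released until 3 local checkpoints have completed, the redo log has to be large enough to hold the writes that arrive over that whole window.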


11/21

NDB Network Issues

Network transport packet size

Buffer would fill and overrun

This caused nodes to miss their heartbeats and drop

This would happen when:

A backup was running

A local checkpoint was running at the same time

Solved by: increasing the network packet buffer


12/21

Issues - IN Statements

IN statements die with engine_condition_pushdown=ON when the set has approx. 10,000 or more values (we hit this with zip codes).

We really need engine_condition_pushdown=ON, but this broke it for us, so we had to disable it.
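One way an application could keep individual IN lists small is to split a large value set into batches. The slides do not say whether smaller lists avoid the failure, so this is only a sketch of the mechanics. The table name `users`, the columns `user_id` and `zip_code`, and the batch size of 1000 are hypothetical, and the `%s` placeholder style is the one used by common Python MySQL drivers.

```python
from typing import Iterator, List, Sequence, Tuple

def chunked_in_queries(
    column: str, values: Sequence[str], chunk_size: int = 1000
) -> Iterator[Tuple[str, List[str]]]:
    """Yield (sql, params) pairs, each with at most chunk_size IN values."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(values), chunk_size):
        chunk = list(values[start:start + chunk_size])
        placeholders = ", ".join(["%s"] * len(chunk))
        # Hypothetical table and column names, for illustration only.
        sql = f"SELECT user_id FROM users WHERE {column} IN ({placeholders})"
        yield sql, chunk

# Example: 2,500 made-up zip codes become three parameterized queries.
zips = [f"{i:05d}" for i in range(2500)]
batches = list(chunked_in_queries("zip_code", zips))
print(len(batches), [len(params) for _, params in batches])
```

Each pair can be passed to a driver's `cursor.execute(sql, params)`, with the application merging the result sets.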


13/21

Structuring Apps: Redundancy

Redundant power supply + dual power sources

Port trunking w/ redundant Gig-E switches

Number of NDB replicas: 2 (2x4 setup), 64 GB max data size

MySQL (API nodes) heads: load balanced with automatic failover
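The 64 GB figure follows from the hardware described earlier: 8 data nodes with 16 GB of data memory each, divided by 2 replicas. A minimal sketch of that arithmetic is below. The function names are illustrative, and the calculation ignores index memory and per-row overhead, so real usable capacity would be lower.

```python
def node_groups(data_nodes: int, replicas: int) -> int:
    """Each node group holds one full set of replicas of its data partition."""
    if data_nodes % replicas != 0:
        raise ValueError("data nodes must divide evenly into replica sets")
    return data_nodes // replicas

def max_data_size_gb(data_nodes: int, data_memory_gb: int, replicas: int) -> int:
    """Upper bound on distinct data: total data memory divided by replica count."""
    return node_groups(data_nodes, replicas) * data_memory_gb

print(node_groups(8, 2), "node groups")    # the "2x4" layout
print(max_data_size_gb(8, 16, 2), "GB")    # matches the slide's 64 GB
```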


14/21

Structuring Apps: Internal Apps

Ultimate goal: Offload the data-intensive processing to the MySQL nodes


15/21

The Good Stuff: Stats!

Queries per Second (over 20 days)

Average 1100-1500 queries/sec during our peak times

Average 250 Queries / Sec


16/21

Website Traffic

Stats for March 2008


17/21

Net Usage: NDB Node

All NDB data nodes have nearly identical network bandwidth usage

MySQL (API) nodes use about 9 MB/s max under our current structure

Totaling 75 MB/s during peak (600 Mb/s)
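The two peak figures are the same quantity in different units: megabytes per second multiplied by 8 bits per byte gives megabits per second. A small sketch of the conversion follows. The function name is illustrative, and the 1000 Mb/s link capacity is an assumption based on the Gig-E switches mentioned earlier.

```python
def mbytes_to_mbits(mbytes_per_s: float) -> float:
    """Convert megabytes per second to megabits per second (8 bits per byte)."""
    return mbytes_per_s * 8

peak_total = mbytes_to_mbits(75)   # cluster-wide peak from the slide
api_node_max = mbytes_to_mbits(9)  # per API node maximum from the slide

# Assumed nominal Gig-E capacity in Mb/s.
GIGE_MBITS = 1000
print(peak_total, "Mb/s total at peak")
print(api_node_max, "Mb/s per API node at most")
print(f"{peak_total / GIGE_MBITS:.0%} of one Gig-E link's nominal rate")
```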


18/21

Monitoring & Maintenance

SNMP Monitoring: CPU, Network, Memory, Load, Disk

Cron Scripts:

Node status & node-down notification

Backups

Database maintenance routines

The MySQL Clustering book provided the base for the scripts
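A node-down check of the kind listed above could be sketched as a small parser that a cron job runs over the management client's status listing. This is not the presenters' script. The sample text is a hand-written approximation of what `ndb_mgm -e show` prints, with made-up addresses and version strings, and the real format may differ between versions.

```python
import re
from typing import List

# Hand-written approximation of a status listing, not captured from a real cluster.
SAMPLE_SHOW_OUTPUT = """\
[ndbd(NDB)]     2 node(s)
id=2    @10.0.0.2  (Version: 5.1.23, Nodegroup: 0, Master)
id=3 (not connected, accepting connect from 10.0.0.3)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @10.0.0.1  (Version: 5.1.23)
"""

# Assumption: a disconnected node's line starts with its id followed by "(not connected".
NOT_CONNECTED = re.compile(r"^id=(\d+)\s+\(not connected", re.MULTILINE)

def down_node_ids(show_output: str) -> List[int]:
    """Return the ids of nodes the listing reports as not connected."""
    return [int(node_id) for node_id in NOT_CONNECTED.findall(show_output)]

down = down_node_ids(SAMPLE_SHOW_OUTPUT)
if down:
    # A real cron job would send a notification here instead of printing.
    print("Nodes down:", down)
```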


19/21

Net Usage: Next Steps

Dolphin NIC Testing

4-node test cluster

4x overall performance

Brand new patch to handle automatic Ethernet failover / Dolphin failover (beta as of March 28)


20/21

Questions?


21/21

Contact Information

Tyler Carlton

www.qdial.com

[email protected]

Cory Sessions

CorySessions.com

OrangeSoda.com
