BEST PRACTICES FOR RELIABLE CARRIER GRADE TELEPHONY Alistair Cunningham, Integrics Ltd.

Reliability

• Think people and culture, not technology.

• Complexity is the enemy.

• Discipline is the answer.

• Management must be willing to sacrifice features.

• Reliability for all customers is more important than winning one new customer.

Staff Responsibilities

• Assign a senior engineer as system manager.

• System manager has ultimate responsibility for whole system.

• Can delegate tasks to others.

Cluster Architecture

• Duplicate all important functions. Use heartbeat, DRBD/GFS, application-level load balancing. Remember utilities.

• Consistency between machines is vital.

• Virtual machines have more outages.

• Monitor all machines, services, and resources.

• Daily and monthly backups.
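
One way to check consistency between machines is to compare file digests collected from each node. The sketch below is illustrative only and is not from the talk: the function names, the manifest layout (machine name mapped to path and digest), and the `<missing>` marker are all assumptions, and how the digests are gathered from each machine is left out.

```python
import hashlib
from collections import defaultdict


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 hex digest of a file's content."""
    return hashlib.sha256(content).hexdigest()


def find_inconsistencies(manifests: dict) -> dict:
    """Report paths whose digest differs between machines.

    manifests maps machine name -> {path: digest} (hypothetical layout).
    The result maps each inconsistent path -> {digest: [machines]},
    using "<missing>" for machines that lack the file entirely.
    """
    all_paths = set()
    for files in manifests.values():
        all_paths.update(files)

    report = {}
    for path in sorted(all_paths):
        by_digest = defaultdict(list)
        for machine, files in sorted(manifests.items()):
            by_digest[files.get(path, "<missing>")].append(machine)
        # More than one distinct digest means the machines disagree.
        if len(by_digest) > 1:
            report[path] = dict(by_digest)
    return report
```

An empty report means every listed file is identical on every machine; anything else names the file and the machines that diverge.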

Upgrades and Changes

• Risk is unpredictable and cumulative.

• Many small changes are riskier than a few large changes.

• Test all changes on a staging machine first.

• Keep records of changes.

• Consider a change management system.

• Keep customizations to a minimum.
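
Short of a full change management system, change records can be as simple as an append-only log. The following is a minimal sketch under assumed conventions, not something the talk prescribes: the JSON-lines file format, the field names, and both function names are hypothetical.

```python
import json
from datetime import datetime, timezone


def record_change(log_path, author, machine, summary, tested_on_staging):
    """Append one change record as a JSON line and return it."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "author": author,
        "machine": machine,
        "summary": summary,
        "tested_on_staging": bool(tested_on_staging),
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry


def untested_changes(log_path):
    """Return every recorded change that was not tested on staging."""
    with open(log_path, encoding="utf-8") as log:
        entries = [json.loads(line) for line in log if line.strip()]
    return [entry for entry in entries if not entry["tested_on_staging"]]
```

Listing the untested entries after an outage gives a quick first set of suspects.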

Dealing with Vendors

• Vendors can never substitute for system manager.

• Give vendors access to staging machines but not production.

• Your staff must have debugging skills.

• Subscribe to security mailing lists.

Causes of Outages

Most outages are caused by one of:

• Untested changes – use staging.

• Hard disks filling up – use monitoring.

• Power and network outages – redundancy or split cluster.

Avoiding these three is usually sufficient to achieve good reliability.
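
As an illustration of the disk monitoring point, the sketch below flags filesystems that cross a usage threshold. It is an assumption-laden example rather than the speaker's tooling: the 80% default, the function names, and the injectable `usage_fn` parameter are hypothetical, and a production setup would more likely use a dedicated monitoring system. Only `shutil.disk_usage` is a real standard-library call.

```python
import shutil


def disk_usage_percent(path: str) -> float:
    """Return the percentage of the filesystem containing path that is used."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def check_disks(paths, warn_at=80.0, usage_fn=disk_usage_percent):
    """Return one alert message per path at or above the warn_at percentage.

    usage_fn is injectable so the check can be exercised without real disks.
    """
    alerts = []
    for path in paths:
        percent = usage_fn(path)
        if percent >= warn_at:
            alerts.append(
                f"{path} is {percent:.1f}% full (threshold {warn_at:.0f}%)"
            )
    return alerts
```

Run from a scheduler on every machine, for example as `check_disks(["/", "/var"])`, it reports a disk as it approaches full rather than after it has filled.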

Questions?