Terraform at Scale


Terraform at Scale
HashiConf

Calvin French-Owen
Co-Founder of Segment

@calvinfo

September 7, 2016

💖

Scaling vectors
- Complexity ❌
- People ✅

How do we move nimbly while adding people?

This talk
- Terraform at Segment
- What makes “good” Terraform
- What’s next

Terraform at Segment

By the numbers
- 16 developers working with Terraform
- 94 microservices
- thousands of AWS resources

A year with Terraform
- December 2012 – Launch day
- April 2015 – Terraform first attempt (v1)
- November 2015 – Terraform “redux” (v2)

Before Terraform

😱

Terraform

Migrating to Terraform (April 2015)


Migrating to Terraform
1. AWS accounts per environment

[Diagram: dev, stage, prod, and old prod AWS accounts connected by VPC peering; the new dev/stage/prod accounts are managed by Terraform]

Separate accounts
- confidence to apply ‘at will’ (see the provider sketch below)
- test the waters without screwing up the old account
- any sort of ‘global’ configs are okay
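One way to wire up separate accounts is a distinct AWS provider configuration per environment, so an apply can only ever touch the account it was pointed at. A minimal sketch, not Segment’s actual config; the profile name, region, and account ID below are hypothetical:

# providers.tf in the stage environment (hypothetical values)
provider "aws" {
  region              = "us-west-2"
  profile             = "segment-stage"    # credentials scoped to the stage account only
  allowed_account_ids = ["111111111111"]   # refuse to apply against the wrong account
}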

Migrating to Terraform
1. AWS accounts per environment
2. Docker and ECS

Terraform: First Attempt

Terraform (our first attempt)
├── Makefile
├── README.md
└── environments
    ├── dev
    ├── production
    └── stage

Terraform (our first attempt)
environments/stage
├── api.tf
├── bastion.tf
├── dns.tf
├── elasticache.tf
├── elbs.tf
├── iam.tf
├── outputs.tf
├── redis.tf
├── s3.tf
├── terraform.tfstate
├── terraform.tfvars
└── vpc.tf

Terraform (our first attempt)

resource "aws_ecs_task_definition" "app" {
  family = "app"

  container_definitions = <<EOF
[
  {
    "cpu": 1024,
    "memory": 768,
    "environment": [
      { "name": "NODE_ENV", "value": "stage" }
    ],
    "image": "segment/app:1.54.14",
    "name": "app",
    "portMappings": [
      { "containerPort": 8000, "hostPort": 8000 }
    ]
  }
]
EOF
}

Life was better!

…but not good.

1. environment drift

Terraform first attempt
├── Makefile
├── README.md
└── environments
    ├── ops
    ├── production
    └── stage

resource "aws_ecs_task_definition" "app" { family = "app"

container_definitions = <<EOF[ { "cpu": 1024, "memory": 768, "environment": [ { "name": "NODE_ENV", "value": "stage" } ], "image": "segment/app:1.54.14", "name": "app", "portMappings": [ { "containerPort": 8000, "hostPort": 8000 } ] }]EOF}

<= stage

resource "aws_ecs_task_definition" "app" { family = "app"

container_definitions = <<EOF[ { "cpu": 1024, "memory": 768, "environment": [ { "name": "NODE_ENV", "value": "stage" } ], "image": "segment/app:1.54.14", "name": "app", "portMappings": [ { "containerPort": 8000, "hostPort": 8000 } ] }]EOF}

<= stage

resource "aws_ecs_task_definition" "app" { family = "app"

container_definitions = <<EOF[ { "cpu": 1024, "memory": 3072, "environment": [ { "name": "NODE_ENV", "value": "production”, } ], "image": "segment/app:1.54.17", "name": "app", "portMappings": [ { "containerPort": 8000, "hostPort": 3000 } ] }]EOF}

prod =>

2. one massive local state

3. production drift

$ terraform plan -target=aws_elb.feels_so_easy

$ terraform plan -target=aws_elb.oh_no_what_have_we_done

Terraform Redux (v2)

Terraform v1 Problems
1. massive shared state
2. locally stored state
3. drift between environments

Terraform v1 Problems
1. massive shared state: split states
2. locally stored state: remote state
3. drift between environments: modules

v2: state management

[Diagram: a shared “core” state (vpc, networking, security groups, asgs) and separate per-service states (auth, api, site, db, cdn); the service states read the core state read-only]
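On disk, that split might look something like the layout below; this is a hypothetical tree, not necessarily Segment’s exact one, with each directory keeping its own state file in S3:

terraform/
├── core/        # vpc, networking, security groups, asgs → its own remote state
└── services/
    ├── auth/    # one state per service
    ├── api/
    ├── site/
    ├── db/
    └── cdn/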

/**
 * Remote state.
 */

resource "terraform_remote_state" "state" {
  backend = "s3"

  config {
    bucket = "segment-ops"
    key    = "terraform/${var.environment}/terraform.tfstate"
  }
}

data "template_file" "test" {
  template = "${file("${path.module}/init.tpl")}"

  vars {
    zone_id = "${terraform_remote_state.state.zone_id}"
  }
}

read only! the zone_id is only a reference into the core state; nothing here writes to it.
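For that reference to resolve, the core configuration has to export the value as an output. A minimal sketch in the same 0.7-era syntax as the slides; that the zone comes from Route 53 and its resource name "internal" are assumptions, not shown in the talk:

/**
 * In the core configuration: export the zone id so service
 * states can read it through terraform_remote_state.
 */
output "zone_id" {
  value = "${aws_route53_zone.internal.zone_id}"
}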

v2: modules

Modules enforce configuration parity.
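Concretely, parity comes from instantiating the same module in every environment and passing in only the values that legitimately differ. A sketch, not Segment’s actual module; the module name, source path, and variable names are hypothetical, while the memory and image-tag values echo the drift example above:

# stage/main.tf
module "app" {
  source      = "../modules/app"
  environment = "stage"
  memory      = 768
  image_tag   = "1.54.14"
}

# production/main.tf: same module, only the deliberate differences change
module "app" {
  source      = "../modules/app"
  environment = "production"
  memory      = 3072
  image_tag   = "1.54.17"
}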

What makes good* Terraform?

*for some definitions of good

Docker AMIs by Packer

Service Config by Terraform

1. Variables
2. Composition
3. State
4. Versioning

1. Variables
- anything a user might want to override should be a variable
- use defaults liberally

1. Variables

resource "aws_instance" "bastion" {
  ami                    = "${module.ami.ami_id}"
  source_dest_check      = false                                 # non-configurable
  instance_type          = "${var.instance_type}"                # configurable
  subnet_id              = "${var.subnet_id}"                    # configurable
  key_name               = "${var.key_name}"                     # configurable
  vpc_security_group_ids = ["${split(",",var.security_groups)}"] # configurable
  monitoring             = true                                  # non-configurable

  tags {
    Name        = "bastion"                                      # non-configurable
    Environment = "${var.environment}"                           # configurable
  }
}

1. Variables

resource "aws_instance" "bastion" {
  ami                    = "${module.ami.ami_id}"
  source_dest_check      = "${var.source_dest_check}"
  instance_type          = "${var.instance_type}"
  subnet_id              = "${var.subnet_id}"
  key_name               = "${var.key_name}"
  vpc_security_group_ids = ["${split(",",var.security_groups)}"]
  monitoring             = "${var.monitoring}"

  tags {
    Name        = "bastion"
    Environment = "${var.environment}"
  }
}
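A matching set of variable declarations can keep the previously hard-coded values as defaults, so existing callers behave exactly as before. A sketch; the descriptions are mine, not from the talk:

variable "source_dest_check" {
  description = "source/destination checking on the bastion instance"
  default     = false
}

variable "monitoring" {
  description = "detailed CloudWatch monitoring"
  default     = true
}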

2. Composition
- build modules as you need them
- it’s okay if not everything fits the abstraction

2. Composition – “full stack”

module "stack" {
  source      = "github.com/segmentio/stack"
  name        = "my-stack"
  environment = "production"
}

2. Composition – inside stack

module "vpc" {
  source = "./vpc"
  …
}

module "security_groups" {
  source = "./security-groups"
  …
}

module "bastion" {
  source = "./bastion"
  …
}

module "dhcp" {
  source = "./dhcp"
  …
}

2. Composition – byo edition

module "cluster" {
  source = "github.com/segmentio/stack//ecs-cluster"

  environment = "prod"
  name        = "cdn"
  vpc_id      = "vpc-eff2eada"
  image_id    = "ami-204faaf3"
}

3. State management
- separate core from services
- states per service
- use Atlas or S3
- use binary plans (commands sketched below)
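With the 2016-era CLI (before 0.9 introduced backend blocks), that workflow looked roughly like this. The bucket and key come from the remote-state snippet earlier; the region is a hypothetical value:

# point the local working copy at the shared S3 state
$ terraform remote config \
    -backend=s3 \
    -backend-config="bucket=segment-ops" \
    -backend-config="key=terraform/stage/terraform.tfstate" \
    -backend-config="region=us-west-2"

# write a binary plan, review it, then apply exactly that plan
$ terraform plan -out=terraform.plan
$ terraform apply terraform.plan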

[Diagram repeated: per-service states (auth, api, site, db, cdn) read the core state (vpc, networking, security groups, asgs) read-only]

4. Versioning

module "stack" {
  source = "github.com/segmentio/stack?ref=v1.x"
}
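Pinning the ref to an exact tag (or commit SHA) rather than a branch makes applies reproducible across machines; v1.0.0 here is a hypothetical tag, not one from the talk:

module "stack" {
  source = "github.com/segmentio/stack?ref=v1.0.0"
}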

What’s next

What’s next
- Applying in CI
- Atlas
- Data sources
- Terraform generation

People + Complexity ✅

Fin

Prior Art
- Stack: github.com/segmentio/stack
- Atlas Examples: github.com/hashicorp/atlas-examples