AWS re:Invent 2016: Life Without SSH: Immutable Infrastructure in Production (SAC318)

Life Without SSH: Immutable Infrastructure in Production (SAC318)

Martin Sirull, AWS Professional Services

Mirza Baig, Experian Consumer Services

December 1, 2016

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.


On today’s show…

Martin’s gonna talk about why we deployed an application in production without SSH keys, and then dive into how it got deployed.

Mirza’s gonna talk about how Martin’s points above impacted (or didn’t) development, and then how the production environment was monitored.

Reference application

• Experian.com

• 10+ million users

• 100,000+ requests per hour

• PCI-compliant environment

What are the network security threats?

Open Ports

DDoS

SQL Injection

XSS

CSRF

POODLE

Heartbleed

Challenges of SSH

SSH tunnels

• Forward tunneling

• Reverse SSH tunneling

• Easy to circumvent firewall rules

Key management

• Where do you store them? Can you control storage?

• Rotation of keys?

• Federation? (Centrify, etc)

Did you know?

Immutable infrastructure possible?

What’s truly immutable infrastructure?

What’s practically immutable infrastructure?

What do we want?

Photo by Jurvetson (flickr)

AUTOMATE

EVERYTHING!

Key goals

• No humans in production

• Everything has to be automated

• No SSH back doors into production

• Development has to be: Easy, fast, secure. Pick three

Ask 2 questions instead

How are we going to get changes into the pipeline?

How are we going to automatically get the data we need off the box?

What does our target environment need?

How are we going to automate?

AMI (image) baking!

The pipeline

(Diagram labels: AWS CodeCommit, Git clone, Build/test, Deploy, Amazon ECS, Redeploy to next environments)

What is AWS CloudFormation?

(Diagram labels: CloudFormation template, CloudFormation stack, AWS resources)


What goes in AWS CloudFormation?

• Amazon S3 buckets

• Amazon DynamoDB tables

• Amazon SQS

• Amazon RDS databases

• Amazon ElastiCache instances

• AWS KMS keys

• IAM roles

• IAM policies

• Amazon CloudFront

• Amazon VPC

• Internet gateway

• Routes

• Route tables

• Network ACL

• Front-end router/ELB

• Internal ELB

• Auto Scaling group and metrics


How do we make it easier for developers?

{
  "ServiceName": "MyAwesomeService",
  "DeploymentSystem": "ECS",
  "DeploymentType": "Python",
  "Port": 8080,
  "RootDir": "helloworld",
  "APIGateway": "True"
}
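A pipeline could consume a descriptor like this by validating it and filling in defaults before it generates any templates. The sketch below is only an illustration under stated assumptions, not the tooling used in the talk: the function name `load_descriptor`, the choice of required keys, and the default values are all hypothetical. Only the field names come from the slide.

```python
import json

# Field names come from the slide; which ones are required is an assumption.
REQUIRED_KEYS = ("ServiceName", "DeploymentSystem", "DeploymentType", "Port")
# Hypothetical defaults for the optional fields.
DEFAULTS = {"RootDir": ".", "APIGateway": "False"}


def load_descriptor(text):
    """Parse a service descriptor, check required keys, and apply defaults."""
    descriptor = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in descriptor]
    if missing:
        raise ValueError("missing required keys: " + ", ".join(missing))
    port = descriptor["Port"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(port, bool) or not isinstance(port, int) or not 1 <= port <= 65535:
        raise ValueError("Port must be an integer between 1 and 65535")
    # Values from the descriptor override the defaults.
    return {**DEFAULTS, **descriptor}


EXAMPLE = """
{
  "ServiceName": "MyAwesomeService",
  "DeploymentSystem": "ECS",
  "DeploymentType": "Python",
  "Port": 8080,
  "RootDir": "helloworld",
  "APIGateway": "True"
}
"""

if __name__ == "__main__":
    print(load_descriptor(EXAMPLE))
```

Failing fast on a malformed descriptor keeps the error in the pipeline, where a developer can see it, instead of on an instance nobody can log in to.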

How do we make it easier for developers?

{
  "Resources": {
    "KMS": [
      {"logical_id": "DefaultKey"}
    ],
    "S3": [
      {"logical_id": "StandardBucket"}
    ],
    "Dynamo": [
      {"logical_id": "table", "hash": "hash", "range": "range"}
    ]
  }
}
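A shorthand like this has to be expanded into full CloudFormation resources somewhere in the pipeline. The following is a minimal sketch of such an expansion, assuming a hypothetical `expand_resources` helper. The resource type names are real CloudFormation types, but the string attribute type, the placeholder capacity values, and the omitted KMS key policy are assumptions made to keep the example short.

```python
def expand_resources(shorthand):
    """Expand the developer shorthand into a CloudFormation-style Resources map."""
    spec = shorthand["Resources"]
    resources = {}
    for key in spec.get("KMS", []):
        # A real AWS::KMS::Key also needs a KeyPolicy; omitted in this sketch.
        resources[key["logical_id"]] = {"Type": "AWS::KMS::Key", "Properties": {}}
    for bucket in spec.get("S3", []):
        resources[bucket["logical_id"]] = {"Type": "AWS::S3::Bucket"}
    for table in spec.get("Dynamo", []):
        # Hash key is mandatory; range key is treated as optional here.
        keys = [(table["hash"], "HASH")]
        if "range" in table:
            keys.append((table["range"], "RANGE"))
        resources[table["logical_id"]] = {
            "Type": "AWS::DynamoDB::Table",
            "Properties": {
                # Attribute type "S" (string) is an assumption.
                "AttributeDefinitions": [
                    {"AttributeName": name, "AttributeType": "S"} for name, _ in keys
                ],
                "KeySchema": [
                    {"AttributeName": name, "KeyType": kind} for name, kind in keys
                ],
                # Capacity values are placeholders, not recommendations.
                "ProvisionedThroughput": {
                    "ReadCapacityUnits": 1,
                    "WriteCapacityUnits": 1,
                },
            },
        }
    return {"Resources": resources}


SHORTHAND = {
    "Resources": {
        "KMS": [{"logical_id": "DefaultKey"}],
        "S3": [{"logical_id": "StandardBucket"}],
        "Dynamo": [{"logical_id": "table", "hash": "hash", "range": "range"}],
    }
}

if __name__ == "__main__":
    import json
    print(json.dumps(expand_resources(SHORTHAND), indent=2))
```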

What does our target environment need?

Base instance configuration: cfn-init

{
  "Resources": {
    "MyInstance": {
      "Type": "AWS::EC2::Instance",
      "Metadata": {
        "AWS::CloudFormation::Init": {
          "config": {
            "packages": {},
            "groups": {},
            "users": {},
            "sources": {},
            "files": {},
            "commands": {},
            "services": {}
          }
        }
      }
    }
  }
}
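The skeleton above leaves every cfn-init section empty. As a rough illustration of what a filled-in config might look like, the sketch below builds the `Metadata` block from Python. The helper name `build_cfn_init_metadata`, the `nginx` package and service, and the `/etc/myapp/app.conf` path are hypothetical examples; the `packages`, `files`, and `services` layouts follow the cfn-init conventions as I understand them.

```python
import json


def build_cfn_init_metadata(packages, files, services):
    """Return instance Metadata containing a single cfn-init config."""
    config = {
        # Package manager -> package name -> version list ([] means any version).
        "packages": {"yum": {name: [] for name in packages}},
        # Absolute path -> file attributes; mode and ownership are assumptions.
        "files": {
            path: {"content": content, "mode": "000644", "owner": "root", "group": "root"}
            for path, content in files.items()
        },
        # Ask cfn-init to keep each listed service enabled and running.
        "services": {
            "sysvinit": {
                name: {"enabled": "true", "ensureRunning": "true"} for name in services
            }
        },
    }
    return {"AWS::CloudFormation::Init": {"config": config}}


if __name__ == "__main__":
    metadata = build_cfn_init_metadata(
        packages=["nginx"],                                  # hypothetical package
        files={"/etc/myapp/app.conf": "listen_port=8080\n"},  # hypothetical file
        services=["nginx"],
    )
    template = {
        "Resources": {
            "MyInstance": {"Type": "AWS::EC2::Instance", "Metadata": metadata}
        }
    }
    print(json.dumps(template, indent=2))
```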

Implications on development

The initial reaction

So you’re telling me that we are rolling a brand new platform out to production, with 100s of instances, and we can’t log in to a single one?

What does our target environment need?

App-specific instance configuration: AWS CodeDeploy

Developer view of AWS CodeDeploy

How to debug code deployments?

How do we configure the application?

The road to self-discovery – Step 1

The road to self-discovery – Step 2

The road to self-discovery – Step 3

Configuration properties

• Feature flags

• Thread pool sizing

• ListenPort

Secure configuration repository

• Consul

• Spring Cloud Config

• Custom solution

– DynamoDB

– Amazon S3
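Whichever repository holds the properties, the application side tends to look similar: start from safe defaults and override them with whatever the store returns. Here is a minimal sketch of that pattern, assuming the store is reachable through a hypothetical `fetch(key)` callable that returns a string or `None`. The property names and default values are invented for illustration.

```python
# Hypothetical property names and defaults, mirroring the kinds of
# properties listed above (feature flag, thread pool size, listen port).
DEFAULTS = {
    "feature.new_checkout": False,
    "threadpool.size": 8,
    "listen.port": 8080,
}


def parse_value(raw, default):
    """Coerce a raw string from the store to the type of its default."""
    if isinstance(default, bool):
        return raw.strip().lower() in ("1", "true", "yes", "on")
    if isinstance(default, int):
        return int(raw)
    return raw


def load_config(fetch, defaults=DEFAULTS):
    """Build the effective config: store values win, defaults fill the gaps."""
    config = {}
    for key, default in defaults.items():
        raw = fetch(key)
        config[key] = default if raw is None else parse_value(raw, default)
    return config


if __name__ == "__main__":
    # A dict stands in for Consul, DynamoDB, S3, or any other backing store.
    fake_store = {"threadpool.size": "32", "feature.new_checkout": "true"}
    print(load_config(fake_store.get))
```

Hiding the store behind a single callable lets a developer point the same code at a local dictionary or file while production points it at the secured repository.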

How about a developer’s config?

Challenges with instance bootstrapping?

• Dependency issues with package installation at runtime

• Potential vector for malicious code injection?

• Automatic scaling slower with a full bootstrap

Can we combine these layers?

What is Docker?

How to get started?

FROM ubuntu:trusty
EXPOSE 80
RUN apt-get update
RUN apt-get install -y python3-setuptools
RUN easy_install3 pip
RUN pip3 install flask
ADD . /home/root
CMD python3 /home/root/hello_world.py


How about the external environment?

Implications on development – Environment configuration

What do we typically need to know about the outside world?

• Database tables

• Amazon SQS queues

• Encryption keys

• Amazon S3 buckets

• Amazon SNS topics

• Amazon Kinesis streams

• Amazon ElastiCache endpoints

The road to self-discovery – Step 2 ( repeat )

The road to self-discovery – Step 3B

aws cloudformation list-stack-resources --stack-name receiptservice-prod-87287ASD0

• S3 buckets

• DynamoDB tables

• SQS

• RDS* databases

• KMS keys
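The JSON returned by `list-stack-resources` can be turned into a lookup table that the application uses to find its own resources at startup. The sketch below assumes the CLI output has been captured as a string; the helper name `physical_ids_by_type` is hypothetical, and the sample output is fabricated purely for illustration (the physical IDs are not real resources).

```python
import json


def physical_ids_by_type(cli_output):
    """Map resource type -> {logical ID: physical ID} from list-stack-resources JSON."""
    summaries = json.loads(cli_output)["StackResourceSummaries"]
    grouped = {}
    for item in summaries:
        by_logical = grouped.setdefault(item["ResourceType"], {})
        # PhysicalResourceId can be absent while a resource is still being created.
        by_logical[item["LogicalResourceId"]] = item.get("PhysicalResourceId")
    return grouped


# Fabricated sample shaped like the CLI response; the values are placeholders.
SAMPLE_OUTPUT = """
{
  "StackResourceSummaries": [
    {"LogicalResourceId": "StandardBucket",
     "PhysicalResourceId": "example-bucket-name",
     "ResourceType": "AWS::S3::Bucket",
     "ResourceStatus": "CREATE_COMPLETE"},
    {"LogicalResourceId": "table",
     "PhysicalResourceId": "example-table-name",
     "ResourceType": "AWS::DynamoDB::Table",
     "ResourceStatus": "CREATE_COMPLETE"}
  ]
}
"""

if __name__ == "__main__":
    resources = physical_ids_by_type(SAMPLE_OUTPUT)
    print(resources["AWS::S3::Bucket"]["StandardBucket"])
```

With this mapping the code only needs to know stable logical IDs such as `StandardBucket`, while the generated physical names can differ in every environment.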

What about credentials?

IAM

What about after the application is up?

A GOOD day in production

A BAD day in production

Instances down?!

NO SSH!

Keep Calm

And Turn Debug On


Production monitoring – Keeping your cool

All logs are immediately shipped off of the box

• Logstash, ELK, Splunk, etc

• Writing directly to Amazon CloudWatch Logs and subscriptions

• http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/Subscriptions.html
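Shipping logs off the box works best when each log line is already machine-readable. As one possible approach, the sketch below emits one JSON object per log record to a stream, so that a log agent or shipper can forward it without parsing free text. The class and function names are hypothetical, and the talk does not say which log format was used.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def format(self, record):
        entry = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Include the traceback so it leaves the box with the message.
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def build_logger(name, stream=None):
    """Create a logger that writes JSON lines to the given stream (stdout by default)."""
    handler = logging.StreamHandler(sys.stdout if stream is None else stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger("receiptservice")
    log.info("listening on port %d", 8080)
```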

Production monitoring – Keeping your cool

Proactive monitoring

• CloudWatch metrics

• Leveraging APM solutions such as New Relic, AppDynamics, etc.

• Advanced health checks

• Spring Boot Actuator

– Health

– Metrics

– Service information

– Thread dumps

– Environment

Other implications on development

Instances must be ephemeral

Fits the microservices paradigm

• No application state written to disk

• Key for automatic scaling

• Cheap to manufacture ( CloudFormation templates )

What happens when…?

I REALLY need access to the disk for forensics, etc.?

• No change from existing best practice

• Snapshot volume and connect to forensics EC2 instance

I need to do a thread dump?

• Standardized logging on startup/shutdown sequences
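Without a shell on the instance, a thread dump has to be produced by the process itself and written to the logs. The talk points to Spring Boot Actuator for JVM services; the sketch below is a Python analogue built only on the standard library, offered as an assumption about how the same idea could look elsewhere. The function name `thread_dump` is hypothetical.

```python
import sys
import threading
import traceback


def thread_dump():
    """Return the current stack of every live thread as one text block."""
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    # sys._current_frames() maps thread id -> that thread's topmost frame.
    for ident, frame in sys._current_frames().items():
        lines.append('Thread "%s" (id=%s)' % (names.get(ident, "unknown"), ident))
        lines.extend(line.rstrip("\n") for line in traceback.format_stack(frame))
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # In a service this could be called from a shutdown hook or a health
    # endpoint and sent through the normal logging pipeline.
    print(thread_dump())
```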

Other implications on development

Securing code pipelines

All changes are versioned

• All ability to deploy changes is managed through IAM roles

• AWS CloudTrail auditing

Source code is sanitized

• Clean package dependencies

• OWASP dependency check

Static analysis

• Parasoft, Fortify, Veracode, etc

Break glass in case of emergency?

Ask 2 questions instead

How are we going to get changes into the pipeline?

How are we going to automatically get the data we need off the box?

How many times have we had to log in?

0

2 years

Thank you!

Remember to complete your evaluations!