Ops for NoOps
Operational challenges for serverless apps
Eric Windisch, CTO, IOpipe, Inc.
ERIC WINDISCH
@ewindisch
Founder & CTO of IOpipe, Inc. www.iopipe.com
ex-Docker, ex-Cloudscaling.
Builder of clouds, destroyer of monoliths.
EVOLUTION CREATES CHALLENGES
➤ Fear, uncertainty, and doubt for new users:
➤ What problems will I run into with this new platform?
➤ What will I do when those problems happen?
➤ Will I know about those problems when they happen?
➤ Is it secure?
➤ What tools to use?
SERVERLESS DEVELOPER PROFILES
➤ Frameworks: SLS, Zappa, Apex, DIY, others.
➤ Event sources: API Gateway, SNS, S3, Kinesis, others. (Alexa and AWS IoT sources are relatively infrequent)
➤ Languages: Node, Python, Java, Go, C, Ruby.
➤ Regions: all of them (us-east, us-west, etc.); several users are moving to new international regions (Sydney, etc.)
➤ Events: 0 to 100M+ events per day
➤ Stage: dev/test through production
CLOUDWATCH
➤ Basic “super-outside” metrics:
➤ Errors
➤ Logs
➤ Invocations/time
➤ Duration
➤ Memory
➤ This is what Datadog, Sumo Logic, etc. ingest.
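As an illustration of how these outside metrics relate to logs: Lambda writes a REPORT line to CloudWatch Logs after each invocation, and duration and memory can be recovered from it. The sketch below is a minimal parser, assuming the commonly seen field layout; the function name `parse_report_line` and the sample line are made up for this example, and newer runtimes may append extra fields that this pattern simply ignores.

```python
import re

# Matches the REPORT line that Lambda writes to CloudWatch Logs after each
# invocation. Field separators are matched loosely (tabs or spaces).
REPORT_RE = re.compile(
    r"REPORT RequestId:\s*(?P<request_id>\S+)\s+"
    r"Duration:\s*(?P<duration_ms>[\d.]+) ms\s+"
    r"Billed Duration:\s*(?P<billed_ms>[\d.]+) ms\s+"
    r"Memory Size:\s*(?P<memory_mb>\d+) MB\s+"
    r"Max Memory Used:\s*(?P<max_memory_mb>\d+) MB"
)


def parse_report_line(line):
    """Return a dict of metrics from a REPORT log line, or None if no match."""
    match = REPORT_RE.search(line)
    if match is None:
        return None
    return {
        "request_id": match.group("request_id"),
        "duration_ms": float(match.group("duration_ms")),
        "billed_ms": float(match.group("billed_ms")),
        "memory_mb": int(match.group("memory_mb")),
        "max_memory_mb": int(match.group("max_memory_mb")),
    }


# Illustrative sample line (request ID and values are made up).
SAMPLE = (
    "REPORT RequestId: 11111111-2222-3333-4444-555555555555\t"
    "Duration: 102.25 ms\tBilled Duration: 200 ms\t"
    "Memory Size: 128 MB\tMax Memory Used: 35 MB"
)
```

Calling `parse_report_line(SAMPLE)` yields one record per invocation, which can then be joined to other log lines by request ID.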
HARD PROBLEMS
➤ Cold-starts
➤ Especially painful for Java users.
➤ Relationship of metrics vs logs.
➤ Lack or difficulty of profiling & tracing tools. When do GCs happen?
➤ Retries - why/when & in relation to event sources
➤ AWS account level limits (& when to bump them up)
➤ Difficulty of managing unsupported languages: C, C++, Go, Ruby, etc.
➤ Debugging of & visibility into distributed systems
➤ Are failures at event-source or lambda function?
➤ Kinesis!!!
➤ Cross-invocation leaks
➤ Memory leaks
➤ File descriptor leaks
➤ Backend process visibility
➤ Thread/callback leaks.
➤ etc.
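Cold-starts and cross-invocation leaks share one mechanism: module-level state lives as long as the process, so it persists whenever a container is reused. The sketch below uses that to flag cold starts and to watch file descriptor growth between invocations. It is a rough illustration, not any vendor's method; `invocation_stats` and `leak_demo_handler` are hypothetical names, and the descriptor count assumes a Linux `/proc` filesystem.

```python
import os
import time

# Module-level state is created once per process, so it survives across
# invocations whenever the platform reuses the container.
_PROCESS_STARTED_AT = time.time()
_invocation_count = 0
_baseline_fds = None


def open_fd_count():
    """Count open file descriptors via /proc (Linux only); None elsewhere."""
    try:
        return len(os.listdir("/proc/self/fd"))
    except OSError:
        return None


def invocation_stats():
    """Call at the start of each invocation for cold-start and leak hints."""
    global _invocation_count, _baseline_fds
    _invocation_count += 1
    fds = open_fd_count()
    if _baseline_fds is None:
        _baseline_fds = fds
    if fds is None or _baseline_fds is None:
        growth = None
    else:
        # Growth relative to the first invocation hints at a descriptor leak.
        growth = fds - _baseline_fds
    return {
        "cold_start": _invocation_count == 1,
        "invocation": _invocation_count,
        "process_age_s": time.time() - _PROCESS_STARTED_AT,
        "open_fds": fds,
        "fd_growth": growth,
    }


def leak_demo_handler(event, context):
    # Hypothetical handler: return the stats so the caller can log them.
    return invocation_stats()
```

The first call in a process reports `cold_start` as true; later calls in the same process report false, and a steadily rising `fd_growth` is a signal worth investigating.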
INSIDE THE PROCESS
➤ We install into your process, around your functions.
➤ Import a library, use a decorator (or low-level reporting API)
➤ Gets info via the Node.js process object, Python sys module, etc.
➤ Timing information for wrapped function(s).
➤ Stacktrace reporting.
➤ Extra logging / events pushed by developers.
➤ & looks outside…
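The decorator approach can be sketched in a few lines of Python. This is not the IOpipe library or its API, only a minimal illustration of the idea: wrap the handler, time it, capture runtime details from `sys`, and record a stack trace on failure. The names `instrumented`, `report`, and `REPORTS` are hypothetical, and the in-memory list stands in for whatever delivery a real agent would use.

```python
import functools
import sys
import time
import traceback

# Stand-in for a real reporting backend: records are kept in memory.
REPORTS = []


def report(record):
    """Hypothetical reporting hook; a real agent would ship this elsewhere."""
    REPORTS.append(record)


def instrumented(func):
    """Wrap a handler to record timing, runtime info, and any stack trace."""
    @functools.wraps(func)
    def wrapper(event, context):
        record = {
            "function": func.__name__,
            "python_version": sys.version.split()[0],
            "platform": sys.platform,
            "error": None,
        }
        started = time.perf_counter()
        try:
            return func(event, context)
        except Exception:
            # Keep the formatted stack trace, then let the error propagate.
            record["error"] = traceback.format_exc()
            raise
        finally:
            record["duration_ms"] = (time.perf_counter() - started) * 1000.0
            report(record)
    return wrapper


@instrumented
def hello_handler(event, context):
    return {"echo": event}
```

Because the wrapper re-raises, the platform still sees the original failure; the report is a side channel rather than a replacement for normal error handling.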
METRICS & ANALYTICS
INTO THE BLACK BOX
GITHUB.COM/IOPIPE/LAMBDA-SHELL
OUTSIDE THE FUNCTION - INSIDE THE BLACK BOX
➤ Reuse of containers and VMs
➤ Cold-starts by VM, container, and app process.
➤ Tenancy of VMs (how many containers)
➤ Host VM processes(!!) & processes in other containers(!!!)
➤ Limited & very likely to go away… probably per-tenant VMs anyway
➤ Spawned processes
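One way to see which processes are visible from inside a function, spawned or otherwise, is to walk `/proc`. The sketch below assumes a Linux sandbox that exposes procfs and is not taken from lambda-shell or IOpipe; `visible_processes` is a hypothetical helper, and what it can see depends entirely on how the platform isolates the sandbox.

```python
import os


def visible_processes(proc_root="/proc"):
    """Return sorted (pid, command line) pairs for processes seen in procfs."""
    found = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return found  # No procfs here (e.g. not Linux): nothing to report.
    for entry in entries:
        if not entry.isdigit():
            continue  # Only numeric directory names are process IDs.
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as fh:
                raw = fh.read()
        except OSError:
            continue  # Process exited or is not readable.
        # Arguments are NUL-separated; kernel threads have an empty cmdline.
        cmdline = raw.replace(b"\0", b" ").decode("utf-8", "replace").strip()
        found.append((int(entry), cmdline))
    return sorted(found)
```

Comparing the result across invocations shows whether child processes outlive the call that spawned them.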
SECURITY
➤ I founded the Docker Security Team…
➤ FYI - Lambda’s not Docker!
➤ Lambda’s not perfect! (Security never is!)
➤ Amazon did a good job.
➤ Re-inventing the wheel means repeating some mistakes solved elsewhere…
➤ Still… AWS did a pretty good job.
➤ Don’t worry about it.
➤ Some questions can only be answered by AWS or with more data! TBD!
APP MANAGEMENT
➤ Actionable metrics from inside & outside the function.
➤ Ingest CloudTrail for context-aware intelligence.
➤ Where events originate, retries, etc.
➤ Alarms -> Lambda invocation
➤ triggers AWS services, PagerDuty, IFTTT, Zapier, etc.
➤ Real-time visibility. Daily, Weekly, Monthly reporting.
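The "Alarms -> Lambda invocation" path can be sketched as a handler that receives a CloudWatch alarm delivered through SNS. The event shape below (a `Records` list whose `Sns.Message` is a JSON string with `AlarmName`, `NewStateValue`, and `NewStateReason`) reflects the usual SNS alarm notification; `summarize_alarms`, `notify`, and `alarm_handler` are hypothetical names, and `notify` only formats a string where a real integration would call PagerDuty, Zapier, or similar.

```python
import json


def summarize_alarms(event):
    """Extract alarm name, state, and reason from an SNS-delivered event."""
    summaries = []
    for record in event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        summaries.append({
            "alarm": message.get("AlarmName"),
            "state": message.get("NewStateValue"),
            "reason": message.get("NewStateReason"),
        })
    return summaries


def notify(summary):
    """Hypothetical notifier; a real one would call an external service."""
    return "[{state}] {alarm}: {reason}".format(**summary)


def alarm_handler(event, context):
    # Only forward alarms that are currently firing.
    return [notify(s) for s in summarize_alarms(event)
            if s["state"] == "ALARM"]
```

Filtering on the `ALARM` state keeps recovery (`OK`) notifications from paging anyone; whether to forward those too is a policy choice.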
GETTING HELP
➤ Gitter…
➤ https://gitter.im/serverless/serverless
➤ Slack…
➤ https://serverless-forum.slack.com/signup
➤ IOpipe Slack (for registered users!)
➤ Forums…
➤ Amazon - https://forums.aws.amazon.com/index.jspa
Register for FREE beta access:
www.iopipe.com
Q&A