Robert Douglass, Chief DevRel Officer
November 12, 2019
» Development process
» Coding standards
» Code deployment
» Infrastructure
» Updates & maintenance
» Data
» Access
» Monitoring
- Onboarding / offboarding developers
- Developer environments
- Stakeholder engagement
- Make them install runtimes and services themselves
- Give them root access to production
- Have them download code
- They can develop on all the sites
- Give them access to a shared development server
- Checkout code from Subversion
- They have to be added to each site manually
- Provide docker images
- Provide a dedicated development server for each developer
- Give them Jira access for triggering deployment
- Run a Jenkins task to add them to sites they should be able to develop on
- Add them to the GitHub or GitLab organization
- Clone production to local
- Development environment for each branch
- Control which sites they can develop on by adding them to the right team
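A minimal sketch of that last, team-based level, assuming GitHub team membership is what grants site access; the org and team names are hypothetical:

```bash
#!/usr/bin/env bash
# Sketch: onboarding grants site access by adding the developer to a
# team; the org and team names are hypothetical examples.
set -euo pipefail

ORG="example-agency"
TEAM="emea-sites"              # team membership maps to deployable sites
USER="${1:?usage: onboard.sh <github-username>}"

curl -sS -X PUT \
  -H "Authorization: token ${GITHUB_TOKEN}" \
  -H "Accept: application/vnd.github.v3+json" \
  "https://api.github.com/orgs/${ORG}/teams/${TEAM}/memberships/${USER}"
```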
- Developers just use whatever is on their laptop. Nobody knows whether the PHP version or the configuration matches anyone else's.
- Shared staging server. If there's one shared staging server where everyone dumps their code for testing, changes collide and nobody can tell whose change broke what.
- Use same environment for all sites
- Vagrant or any similar solution for local.
- Ops manually builds cloud environments for the team
- Someone builds and provides a Docker container for local
- Same Docker container can run in a container grid like Kubernetes
- Use different Docker containers for each site
- A fully automated system that reproduces the cloud environment locally, like Lando. https://lando.dev/
- A copy of the production infrastructure for every testing operation, with exact services, configuration, and data.
- Quickly switch locally, or in the cloud between distinct environments for each of the 1,000 sites.
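One way to reach the "someone provides a Docker container for local" level; the image name, tag, and port mapping are hypothetical examples:

```bash
# Sketch: run the same image locally that production runs, with the
# site's code mounted in. Image name, tag, and ports are hypothetical.
docker run --rm -it \
  -v "$PWD":/var/www/html \
  -p 8080:80 \
  registry.example.com/fleet/php:7.3-apache
```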
- Show the client on production
- Show the client on the shared staging server
- Send the client a URL that demonstrates a specific feature on a specific site.
- Have URLs to send for every feature for every developer, in isolation
- Have URLs to send for every feature for every site
- 3rd party libraries
- Code quality & Testing
- Developers download libraries directly into the codebase
- Developers use a build system like Composer and check the artefacts into Git.
- Developers check the composer.json and composer.lock into Git, and the system applies a unified build pipeline across all developer projects.
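A sketch of what that unified pipeline step might look like, assuming $REPO_URL points at a project that commits only composer.json and composer.lock:

```bash
#!/usr/bin/env bash
# Sketch of the unified pipeline step: the repository carries only
# composer.json and composer.lock; the build produces the artifact.
set -euo pipefail

git clone --depth 1 "$REPO_URL" build
cd build

# composer.lock pins exact versions; --no-dev keeps dev tooling out.
composer install --no-dev --prefer-dist --optimize-autoloader

tar -czf "../site-$(git rev-parse --short HEAD).tar.gz" .
```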
- Anything goes. If the application "works", code gets deployed
- Unit testing can block a deploy
- Code linting can block a deploy
- Use of blacklisted code can block a deploy
- Hooks exist for regression testing, integration testing, and performance testing
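A sketch of such deploy-blocking gates, assuming PHPUnit and phpcs (with a coding standard such as Drupal's) are installed as Composer dev dependencies; the paths and blacklist pattern are illustrative:

```bash
#!/usr/bin/env bash
# Sketch of deploy gates: any failing step blocks the deploy.
set -euo pipefail

vendor/bin/phpcs --standard=Drupal web/modules/custom   # linting gate
vendor/bin/phpunit --testsuite unit                     # unit test gate

# Blacklisted-code gate: block the deploy if forbidden calls appear.
if grep -rn --include='*.php' -E 'eval\(|exec\(' web/modules/custom; then
  echo "Blacklisted function found; blocking deploy." >&2
  exit 1
fi
```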
- Deployment methodology
- Speed and frequency of deployment
- Interruption caused by deployment
- Deploying to a fleet
- Rolling back deployments
- SFTP
- USB Sticks
- Overwrite existing code
- Git pull on the server
- Git push triggers deployment to a test environment
- Git merge triggers deployment to production
- Old environment is not updated but replaced
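The "Git push triggers deployment to a test environment" level can be as small as a server-side hook; this sketch assumes a bare repository and a hypothetical /var/www/test docroot:

```bash
#!/usr/bin/env bash
# Sketch of a post-receive hook in the server's bare repository:
# every push to "develop" refreshes a test environment.
set -euo pipefail

TEST_ROOT=/var/www/test   # hypothetical test docroot
while read -r oldrev newrev ref; do
  if [ "$ref" = "refs/heads/develop" ]; then
    GIT_WORK_TREE="$TEST_ROOT" git checkout -f develop
    (cd "$TEST_ROOT" && composer install --no-dev)
  fi
done
```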
- "We update the site twice a year"
- Deployment at the end of each sprint
- "We deploy every day, many times, as soon as a feature passes testing"
- Even when the site is under load
- Even on Black Friday (ecommerce)
- We post a "site offline" page when deploying
- Freeze requests during the critical phase when database schemas are being updated
- Customers never notice that we deploy. No downtime whatsoever.
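One common pattern behind "no downtime whatsoever" is an atomic symlink switch between release directories; the paths here are hypothetical:

```bash
#!/usr/bin/env bash
# Sketch: build the new release beside the old one, then switch a
# symlink; rename(2) makes the cut-over atomic. Paths are hypothetical.
set -euo pipefail

RELEASE="/srv/releases/$(date +%Y%m%d%H%M%S)"
git clone --depth 1 "$REPO_URL" "$RELEASE"
(cd "$RELEASE" && composer install --no-dev)

ln -sfn "$RELEASE" /srv/current.new
mv -Tf /srv/current.new /srv/current   # the only user-visible moment
```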
- Deploy to each site, one at a time
- Drupal multisite: deploy code, then have each database update run sequentially
- No multisite
- Jenkins to automate every step, provide a list of sites
- Sites pull updates automatically
- Target groups of sites for updates
- All sites independent and in parallel
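A sketch of the "provide a list of sites" approach: whatever orchestrates (Jenkins or otherwise) feeds a site list to a script that deploys in parallel. sites.txt and deploy.sh are hypothetical:

```bash
#!/usr/bin/env bash
# Sketch: deploy to every site named in sites.txt, four at a time.
set -euo pipefail

xargs -a sites.txt -P4 -I{} ./deploy.sh {}
```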
- Once we deploy, the old site is gone
- Updating a server is a one-way street
- There's always the backup…
- We keep the old deployment around (eg symlink, or separate server) in case we're not happy with the deployment
- Use DNS or Loadbalancer to direct traffic
- Codebase specifies not only the application but the infrastructure
- Rolling back code and infrastructure is done with a Git Revert
- Snapshots are taken before deployment and can be restored easily
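With releases kept side by side, rollback is just pointing "current" back at the previous release; the same Git-revert idea covers infrastructure that is defined in code. Paths are hypothetical:

```bash
#!/usr/bin/env bash
# Sketch: roll back by pointing "current" at the previous release kept
# from the last deploy. Paths are hypothetical.
set -euo pipefail

PREVIOUS=$(ls -1dt /srv/releases/* | sed -n '2p')   # second-newest
ln -sfn "$PREVIOUS" /srv/current.new
mv -Tf /srv/current.new /srv/current

# When the infrastructure is defined in code, the same move is:
#   git revert <bad-commit> && git push   # pipeline re-applies old state
```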
- Ease of provisioning
- Versioning
- Immutable
- Rollbacks
- Development parity
- Shared or isolated
- Disaster recovery
- Ops ticket for new environments
- Collect specifications
- Order machines
- Ops configures machines, installs software
- You get access, test
- VMs + Ansible / Puppet
- Launch your Docker image
- Define infrastructure in code
- Infrastructure adapts automatically on every Git push
- Create new projects with one click
- Infrastructure scales automatically
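A sketch of "infrastructure adapts automatically on every Git push", assuming the repository holds a Terraform definition and CI runs this step:

```bash
#!/usr/bin/env bash
# Sketch of a CI step run on every push: re-apply the infrastructure
# definition so running infra always matches the repository.
set -euo pipefail

terraform init -input=false
terraform plan -input=false -out=tfplan
terraform apply -input=false tfplan

# The infra version is then simply the Git hash (see Versioning below).
echo "infrastructure version: $(git rev-parse HEAD)"
```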
- No versioning; someone goes in from time to time to "maintain" the infra.
- Infrastructure obeys a template (Ansible, Puppet, Terraform)
- Therefore the version of the infrastructure can be deduced from the template
- Infrastructure is strongly linked to a deployment
- Infrastructure is created for a deployment
- The versioning of the deployment (Git hash) is thus the versioning of the infrastructure
- The process of setting up development differs in some way from the process of setting up production
- Developers are left on their own to get the software and services they need
- Container images are prepared using guidelines that approximate parity with production:
- Docker
- Lando
- Vagrant
- The same build process that builds and deploys production infrastructure also builds development environments (cloud and local)
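A sketch of that full-parity level: one build path for every environment, with the environment name as the only varying input. The image tag and bin/smoke-test script are hypothetical:

```bash
#!/usr/bin/env bash
# Sketch: one build path for every environment; only the environment
# name varies. Image tag and bin/smoke-test are hypothetical.
set -euo pipefail

ENVIRONMENT="${1:-local}"   # local | staging | production

docker build -t "fleet/site:${GIT_COMMIT:-dev}" .
docker run --rm -e ENVIRONMENT="$ENVIRONMENT" \
  "fleet/site:${GIT_COMMIT:-dev}" bin/smoke-test
```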
- Server level access to multiple sites at once
- Sites share infrastructure and occupy same user space
- Multisite
- Docroots with no containerisation
- Developers work simultaneously on a shared environment
- Environments are segregated by permissions, but not physically (RAM / CPU)
- Containerisation provides guarantees about access, CPU, RAM segregation
- Dedicated infrastructure for each site
- Scalability issues - too costly if sites are small.
- Start from scratch, recreate site somewhere new
- Separate backups for various data sources
- Guarantee exact copy of infrastructure in new location
- Redeploy code from Git
- Restore latest backups
- Import DB
- Move files into place
- Automatic replacement of failed infrastructure
- Data attaches to new services automatically (service discovery)
- If data rollback is needed, can be done with API call
- Can be applied to 1 or 1000 sites with same process
- Underlying capacity guarantees (eg public cloud provider)
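The "redeploy code from Git, restore latest backups" runbook, sketched for a single site; repository, backup locations, and database naming are hypothetical:

```bash
#!/usr/bin/env bash
# Sketch of return-to-operations for one site: code from Git, database
# and files from the latest backups. All names are hypothetical.
set -euo pipefail

SITE="$1"
git clone "git@git.example.com:fleet/${SITE}.git" "/srv/${SITE}"
(cd "/srv/${SITE}" && composer install --no-dev)

# Each data source is backed up, and restored, separately.
gunzip -c "/backups/${SITE}/db-latest.sql.gz" | mysql "${SITE}"
rsync -a "/backups/${SITE}/files/" "/srv/${SITE}/web/sites/default/files/"
```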
- You can change things on the server
- Puppet (or similar) detects changes and reverts them, thus providing some guarantees of state
- Builds are permanent and can't be changed.
- Read-only file systems
- Disposable
- Reverting to a previous Git commit reverts infrastructure to that state as well.
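Containers make "read-only and disposable" concrete; Docker can run a container with a read-only root filesystem and explicit scratch space. The image name is hypothetical:

```bash
# Sketch: a disposable, read-only runtime. Writable scratch space is an
# explicit tmpfs; nothing else can be changed. Image name is hypothetical.
docker run --rm --read-only --tmpfs /tmp \
  registry.example.com/fleet/php:7.3-apache
```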
- Upstream security
- Auditing the fleet
- Updating the fleet
- Equifax (the 2017 breach, caused by an unpatched open-source dependency)
- Security mailing lists
- Code Vulnerability Monitoring tools:
https://techbeacon.com/app-dev-testing/13-tools-checking-security-risk-open-source-dependencies
- Upstream updates are automatically pulled and prepared for testing on a regular basis
- How many sites run Drupal 8.6?
- How many sites run Views module?
- How many sites have an old version of React.js?
- Ask the site what versions it has, e.g. with Drupal Console
- Audit composer.lock for 1 or 1000 sites to identify vulnerability status
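A sketch of that fleet-wide audit, reading each site's composer.lock; the package name and directory layout are hypothetical:

```bash
#!/usr/bin/env bash
# Sketch: answer "which sites run package X, and at which version?" by
# reading each site's composer.lock. Package and layout are hypothetical.
set -euo pipefail

PACKAGE="drupal/core"
for lock in /srv/sites/*/composer.lock; do
  version=$(jq -r --arg p "$PACKAGE" \
    '.packages[] | select(.name == $p) | .version' "$lock")
  printf '%s\t%s\n' "$lock" "${version:-not installed}"
done
```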
- Drupal Multisite: inconsistent due to DB updates; all-or-nothing
- Push updates to 1 or more sites; sites grouped by business unit or other similarities
- Sites are self updating: changes pushed only to upstreams; depending on how sites are composed, updates come from upstreams
- Backup consistency
- Backup frequency and retention
- Backup accessibility
- Return-to-operations - did you test that?
- Anything using zip, tar, rsync, or ftp
- All data, including uploaded files, database, search index is duplicated
- Disk-level snapshots
- Incremental snapshots allow very granular rollback
- Retention is prescribed, and auditable.
- Different data types may require different retention plans.
https://docs.ceph.com/docs/giant/rbd/rbd-snapshot/
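The linked Ceph documentation covers RBD snapshots; the create/list/rollback cycle looks roughly like this, with hypothetical pool, image, and snapshot names:

```bash
# Sketch of the snapshot cycle described in the linked Ceph RBD docs.
# Pool, image, and snapshot names are hypothetical.
rbd snap create sites/site-042@pre-deploy     # before a deployment
rbd snap ls sites/site-042                    # audit what exists
rbd snap rollback sites/site-042@pre-deploy   # granular rollback
```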
- Physical media in the CTO's office
- Tarballs that you have to download
- Snapshots that can be applied to one Region
- Snapshots, controlled by an API, globally applicable
- Did you actually test that?
- Teams & Organizations
- Authorization Granularity
- Central authentication
- SSO
- Organization: For whom is Developer X working?
- Teams: On which functional team is X working?
- Organization + Team + Role determines access permissions on 1 or 1000 sites.
- Support for Stakeholder / Customer
- Project Manager
- Auditor
- Developer
- Administrator
- Uptime monitoring
- Application monitoring
- Multiple HTTP level checks, including authenticated, transactional processes
- Monitor API endpoints for availability, performance
- NewRelic
- TideWays
- Blackfire.io
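A sketch of an HTTP-level check that goes beyond a ping: verify status code and response time on an authenticated endpoint. URL, token, and threshold are hypothetical:

```bash
#!/usr/bin/env bash
# Sketch of an HTTP-level check: verify status code and response time
# on an authenticated endpoint. URL, token, threshold are hypothetical.
set -euo pipefail

read -r code time < <(curl -sS -o /dev/null \
  -H "Authorization: Bearer ${API_TOKEN}" \
  -w '%{http_code} %{time_total}\n' \
  "https://site-042.example.com/api/health")

[ "$code" = "200" ] || { echo "status ${code}" >&2; exit 1; }
awk -v t="$time" 'BEGIN { exit !(t < 1.5) }' \
  || { echo "too slow: ${time}s" >&2; exit 1; }
```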