
Excerpt from PNSQC 2014 Proceedings PNSQC.ORG Copies may not be made or distributed for commercial use Page 1

Quality Data Management Christopher W H Davis, Nike Inc.

em: [email protected]; tw: @techchrisdavis

Abstract The process of testing code generates a lot of data, but rarely does this data come together in a consistent and insightful way. Teams normally need to focus on making sure immediate deliverables are high quality and don't have time to derive metrics by distilling multiple sources. To make this data truly fit into today's agile world, the ability to make insightful data a product of a team's work is crucial.

Quality Data Management (QDM) is the art of bringing insights and metrics from data throughout the Quality Management (QM) process back to scrum teams & business owners to show the true picture of quality for the entire Software Development Life Cycle (SDLC). Data points include task management, Kanban/Scrum boards, static code analysis, and automated functional test runs. Making sense of this diverse set of data ends up bringing unstructured data analysis back to the QM team.

This session discusses the opportunity of QDM, how to associate these disparate data points, and how to put it all together in your automation pipeline.

Biography After over 15 years of experience leading and working on software engineering teams in the finance, travel, and healthcare industries, Chris Davis led the team that delivered the platform behind Nike+, the Nike+ FuelBand, and Nike+ Running. In his current role he is leading the effort to bring world-class QM Automation across all of Nike's consumer digital teams, ensuring quality in and through engineering.

Copyright Chris Davis 6/15/2014


1 Introduction In today’s software delivery models, Continuous Integration (CI) is a necessary part of delivery. Many teams are trying to evolve their CI into Continuous Deployment (CD), which, in a nutshell, combines more advanced build and deployment tools with integrated automated testing to facilitate faster, more automated delivery. To deliver software faster, automating quality checks is critical to keeping up with the demands of product owners and consumers in today’s rapidly changing technology environments. Consumers expect constant upgrades and improvements in software products, and without automating much of your delivery it becomes increasingly hard for teams to keep up with demand.

Using today’s CI & CD systems throughout the Software Development Lifecycle (SDLC), teams generate lots of data that, when combined, can give great insight into the quality of their processes and products. However, because of the distributed and decoupled nature of these systems, as well as the plethora of products that facilitate the different phases of the SDLC, there isn’t a standard for pulling all of this data together or combining it in a meaningful way. This is a significant problem as teams transition to CD, since one of the keys to successful CD is automatically determining the quality of software changes based on the data you have.

Enter Quality Data Management: the ability to manage the data generated through your CI pipeline to determine the characteristics of code changes, and whether they are good or bad in the context of what you’re developing. To understand how to manage this data, the best place to start is by grouping the different systems in the SDLC to figure out what kind of data we can get from each.

1.1 System Components Within a Typical SDLC

The following diagram shows the different phases of the SDLC and what they do:

Figure 1: The Different Systems Typically Used Throughout the SDLC

All of these system types have valuable data that, when used together, can help determine the risk of software updates and objectively determine whether a change should be released or sent back to development. Many of the systems in the SDLC can show trends in their own data but typically can’t aggregate data across the complete workflow of software development. To manage all of this data and use it in concert for quality purposes, we need a separate system that can aggregate and analyze everything we can get from these source systems.

As with any system you’re about to build or buy, you need to figure out the key requirements.

1.2 Feature Requirements of a QDM System

To give us this level of insight into the software we’re analyzing, a system needs to adhere to the following set of requirements:

• Automate data collection from all points in the SDLC
• Allow users to manually tie data together that isn’t initially automated
• Store data in a way where it can easily be tied together

[Figure 1 content: Project Tracking (manage tasks & bugs); Source Control (manage code & collaboration); Continuous Integration (generate builds & run tests); Deployment Tools (move code across environments); Application Monitoring (ensure everything is working)]


• Expose data in a way where it is easily accessed and aggregated for interfaces
• Easily plug in new data sources

By automatically collecting data about the changes you’re making to your system, you arm yourself with enough information to accurately point to problems and quantify software quality. CI & CD systems are automated, so they are perfect candidates from which to automatically collect data; doing this work manually is cumbersome and costly.

Once you have data you need to analyze it to determine whether it’s useful. Ideally your system can present data flexibly enough that users can spot trends in graphs and charts. Once we have the ability to see trends we can determine what is good or bad, and ideally use some kind of machine learning to be on the lookout for those patterns later.
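As a minimal sketch of what "being on the lookout" can mean before any real machine learning is involved, the snippet below flags a metric series whose recent average drifts away from its history. The window size and tolerance are illustrative assumptions, not prescriptions.

```python
# Sketch: flag a metric series whose recent average drifts beyond a
# tolerance of its historical average, a simple precursor to the
# machine-learning pattern spotting discussed later in the paper.

def trend_alert(series, window=3, tolerance=0.5):
    """Return True if the mean of the last `window` points deviates from
    the mean of the earlier points by more than `tolerance` (as a
    fraction of the historical mean)."""
    if len(series) <= window:
        return False
    history, recent = series[:-window], series[-window:]
    hist_mean = sum(history) / len(history)
    recent_mean = sum(recent) / len(recent)
    if hist_mean == 0:
        return recent_mean != 0
    return abs(recent_mean - hist_mean) / hist_mean > tolerance

# Example: error counts per build, climbing sharply in recent builds.
errors_per_build = [2, 3, 2, 3, 2, 8, 9, 10]
print(trend_alert(errors_per_build))  # a jump like this should alert
```

The same check works on any per-build or per-sprint metric you collect, which is why a flexible data store matters more than any one alert rule.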

Finally, systems and technology change. You need to think ahead and assume that any of your systems can get swapped out at any time. From that perspective, it should be very easy to plug in new systems or data streams any time there’s a new source that could add value to your quality analysis.

2 The Data You’re Looking For Some data points to quality in an obvious way: for example, if certain rules are broken or tests fail, you can point to specific things you need to fix to get better quality. Other things aren’t as obvious; for example, code churn or counts of non-blocking bugs may point to problems with your requirements or development processes rather than the software product itself. In such cases, it’s important to look for patterns and collaborate with the team on the larger meaning before jumping to conclusions. Let’s take a more detailed look at the systems you need to get data from and what you’re looking for in each.

2.1 Project Tracking Systems

Project tracking systems like JIRA or VersionOne can give you information about what was changed and the amount of churn changes went through in development. This data tends to be relative to the team but can be valuable in assessing the risk of change. For example, in some cases a lot of churn on a particular issue or subsystem can point to poorly understood requirements, changes to requirements, or particular components of your system under development that may require further review or indicate higher risk. Conversely, lots of churn could be due to the way the team communicates; maybe a lot of churn indicates that everyone is participating and there is great understanding. The fact that the same trend can mean something completely different depending on the team makes finding issues from project tracking systems a bit of a gray area. Here we can go back to the requirement for a manual interface with which to mark trends as positive or negative based on how a particular team operates.
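One concrete way to quantify the churn described above is to count status transitions per issue from a tracker's changelog. The record shape below is a simplified stand-in for what JIRA-style changelog APIs return, not an actual JIRA schema.

```python
# Sketch: measure per-issue churn as the number of status transitions
# recorded in a project-tracking system's changelog. Field names are
# illustrative, not a real tracker schema.

from collections import Counter

def churn_by_issue(changelog):
    """Count status-change events per issue key."""
    churn = Counter()
    for event in changelog:
        if event["field"] == "status":
            churn[event["issue"]] += 1
    return churn

events = [
    {"issue": "WEB-101", "field": "status", "from": "Open", "to": "In Progress"},
    {"issue": "WEB-101", "field": "status", "from": "In Progress", "to": "Reopened"},
    {"issue": "WEB-101", "field": "status", "from": "Reopened", "to": "Done"},
    {"issue": "WEB-102", "field": "status", "from": "Open", "to": "Done"},
    {"issue": "WEB-102", "field": "assignee", "from": "a", "to": "b"},
]
print(churn_by_issue(events))  # WEB-101 churned three times, WEB-102 once
```

Whether three transitions is healthy debate or a requirements problem is exactly the team-relative judgment call the manual interface is for.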

2.2 Source Control

Source control is where your development team is actively collaborating and changing the product. Since this is where code changes happen constantly, it is a great place to get data on how frequently certain parts of your codebase undergo changes, also known as code churn. On teams using a feature branch workflow, where pull requests are required to get changes to consumers, you can get data around code churn and code reviews. If a specific part of your codebase is undergoing a lot of churn, this is usually a good indicator that the code will have a higher risk of malfunctioning than more stable code. Additionally, looking at source control usage as a quality metric can help make the code review process more visible and drive better accountability for individual changes. As churn in project tracking systems pointed to the clarity of requirements, code churn can point to problem spots in specific places in your code that may pose more risk than others.
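For teams on git, per-file code churn can be derived directly from `git log --numstat`. The sketch below parses a captured string; in practice you would feed in the output of `subprocess.run(["git", "log", "--numstat"], ...)`.

```python
# Sketch: derive per-file code churn from `git log --numstat` output,
# whose data lines are tab-separated: additions, deletions, path.

from collections import defaultdict

def churn_from_numstat(numstat_text):
    """Sum added+deleted lines per file across all commits."""
    churn = defaultdict(int)
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip commit headers and blank lines
        added, deleted, path = parts
        if added == "-":
            continue  # binary files report "-" instead of counts
        churn[path] += int(added) + int(deleted)
    return dict(churn)

sample = "10\t2\tsrc/cart.py\n3\t1\tsrc/cart.py\n1\t0\tREADME.md\n-\t-\tlogo.png\n"
print(churn_from_numstat(sample))  # src/cart.py has churned the most
```

Files that keep appearing at the top of this list are the risk hot-spots the section describes.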


2.3 Static Code Analysis

Using tools like Sonar, static code analysis is a simple way to implement basic rule checking in your automated QA process. Static code analysis checks your code against basic rules to make sure you’re following best practices and not making common mistakes like forgetting to handle errors properly or putting variables in the wrong scope. Static analysis can also help you find potential problem spots in your code, such as high cyclomatic complexity or cyclic dependencies. These types of mistakes become more common when a team is put under heavy pressure to deliver. Having a system that can report rule violations is very useful to ensure that shortcuts don’t become the norm when deadlines are looming.

Static code analysis should be used differently depending on the age of your codebase. For projects that have the opportunity to start from scratch with new code, it’s easy to set hard gates and thresholds for the team to follow. For software that’s been around for a while it’s not so easy, as you likely have many code violations that would take a significant amount of time to track down and fix. In this case, you likely want to analyze only new changes, to ensure that at least the code doesn’t get worse, or set thresholds so you can tackle tech debt as you deliver new features.
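A "don't get worse" gate for an older codebase can be as simple as a set difference against a recorded baseline, failing only on violations introduced since the baseline was taken. The violation IDs below are invented for illustration; real ones would come from your static analysis tool's report.

```python
# Sketch: fail only on static analysis violations that are not in a
# recorded baseline, so legacy debt doesn't block every build.

def new_violations(baseline, current):
    """Return violations introduced since the baseline was recorded."""
    return sorted(set(current) - set(baseline))

baseline = {"cart.py:42:unused-var", "auth.py:10:broad-except"}
current = {"cart.py:42:unused-var", "checkout.py:7:missing-error-handling"}

introduced = new_violations(baseline, current)
if introduced:
    print("GATE FAILED, new violations:", introduced)
```

Note that fixed violations (like the `auth.py` one here) simply drop out; periodically re-recording the baseline ratchets the debt downward.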

2.4 Automated Functional Tests & Production Monitoring

Automated functional tests tell you whether your software does what it’s supposed to do from the customer’s perspective. Realistically these tests should never fail, and if they do, a release would likely be blocked unless there are business reasons not to. Whenever you run automated functional tests your code should be deployed, ideally on a system that replicates your production environment. As these tests run you should be capturing results from the same tools you use for your production monitoring. You could have 100% of your functional tests pass, but if they generate 1000 errors in your logs with stack traces, what the heck are you doing? I have seen this happen when teams swallow errors instead of handling them properly, or when front-end code compensates for a poorly designed or poorly functioning backend by working around its inconsistencies. This is especially common when teams have a large number of functional tests that only test the end result and don’t pay attention to what the internals of the system are doing. Ensuring that you’re looking at the details of how your system is operating while it’s under functional test is key to getting a quality product out the door.
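The "green tests, screaming logs" situation above is easy to catch mechanically: scan the application logs captured during a functional test run and fail when error lines exceed a threshold, regardless of test results. The log format and the zero-error threshold are illustrative assumptions.

```python
# Sketch: a gate on logs captured during a functional test run, so a
# run can fail on logged errors even when every test passes.

import re

ERROR_PATTERN = re.compile(r"\b(ERROR|FATAL)\b")

def log_gate(log_lines, max_errors=0):
    """Return (passed, error_count) for a captured log."""
    count = sum(1 for line in log_lines if ERROR_PATTERN.search(line))
    return count <= max_errors, count

log = [
    "2014-06-15 10:01 INFO  checkout started",
    "2014-06-15 10:01 ERROR NullPointerException in PriceService",
    "2014-06-15 10:02 INFO  checkout finished",
]
passed, errors = log_gate(log)
print(passed, errors)  # the tests may be green, but this gate is not
```

Tuning `max_errors` above zero is sometimes necessary for noisy legacy systems, but each allowance should be a deliberate team decision.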

2.5 Combining Data Together

All of these systems on their own tell you a lot about the quality of a particular phase of the SDLC, but it’s the aggregate of all these systems that truly gives you the full picture of how a team is operating and whether the software you’re delivering meets the quality requirements of the organization.

Project tracking + Source control = how well your team is communicating & working together.

Static analysis + production monitoring = the overall quality of your software.

A team that is functioning at a high level typically produces better quality, so it’s important to look at the combinations of data to get the full picture.
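The combinations above imply a join on some shared key, most naturally the sprint. A minimal sketch of that merge, with invented field names, could look like this:

```python
# Sketch: join per-sprint metrics from project tracking and source
# control into one combined document per sprint. Field names are
# illustrative, not from any particular tool.

def combine_by_sprint(tracking, source_control):
    """Merge two {sprint: metrics} maps into combined documents."""
    combined = {}
    for sprint in set(tracking) | set(source_control):
        doc = {"sprint": sprint}
        doc.update(tracking.get(sprint, {}))
        doc.update(source_control.get(sprint, {}))
        combined[sprint] = doc
    return combined

tracking = {"S1": {"tasks_done": 12, "bugs": 4}}
source_control = {"S1": {"commits": 87, "lines_churned": 2400}}
print(combine_by_sprint(tracking, source_control)["S1"])
```

In a document store this combined record is what you would actually persist, so dashboards never have to re-join the source systems.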

3 The Quality Data Management (QDM) System At a high level the concepts we’re putting together aren’t new; in fact, they are a common way to work with large amounts of unstructured data in today’s web-based world. With the large amounts of data that web-based systems collect today, management of that data is crucial to the success of those systems. Similarly, you can collect large amounts of unstructured data from the various systems you’re using to deliver software, and all of those systems can give you insights into the quality of your code and your process. The logical conclusion is a system to manage the data so it can be used in the measurement of that quality; thus, a Quality Data Management (QDM) system.


Figure 2 illustrates the types of systems you want to access, and the general flow of a QDM system.

Figure 2 – The High Level System Components

On the left side are the systems you want to collect data from; on the right side is what you’re going to do with that data.

3.1 Getting Data

Once you’ve identified all the systems in your SDLC, you need to get that data out through some kind of Extract, Transform & Load (ETL) process into a database. Since you’re combining several different types of data from several different systems, a NoSQL document store will be the easiest to work with.

The best course of action is to use tools that your company already supports and knows, for the lowest barrier to entry for data analysis and aggregation. However, if you don’t have anything you can easily tap into, a simple recommendation for a database would be MongoDB. MongoDB has built-in map-reduce functionality to distill your data easily. It also manages unstructured data very well, is very popular, and is available in the package repositories of the more popular Linux distributions.
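To show the shape of the map-reduce distillation MongoDB offers, here is the same idea in plain Python over quality documents: map each document to key/value pairs, then reduce per key. The document fields are illustrative.

```python
# Sketch: the map-reduce pattern in plain Python, distilling quality
# documents into totals per key; here, bug counts per component.

from collections import defaultdict

def map_reduce(docs, map_fn, reduce_fn):
    """Group values emitted by map_fn per key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in grouped.items()}

docs = [
    {"type": "bug", "component": "checkout"},
    {"type": "bug", "component": "checkout"},
    {"type": "task", "component": "checkout"},
    {"type": "bug", "component": "auth"},
]
bugs_per_component = map_reduce(
    docs,
    map_fn=lambda d: [(d["component"], 1)] if d["type"] == "bug" else [],
    reduce_fn=sum,
)
print(bugs_per_component)  # {'checkout': 2, 'auth': 1}
```

In MongoDB the map and reduce functions would be expressed against the server (today typically via the aggregation pipeline), but the distillation logic is the same.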

The easiest way to get data into your database is to use simple scripts that extract key elements from your target systems and load them into your database. Popular scripting languages that make this straightforward are Python and Groovy. Both have libraries that can get data easily over HTTP, and both have libraries that connect to MongoDB fairly seamlessly.
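An extraction script in this style might pull issues over HTTP, flatten them into documents, and insert them with pymongo. The endpoint, payload fields, and collection names below are hypothetical; adapt them to your tracker's actual API. Only the shaping step runs here; the network call is shown but not exercised.

```python
# Sketch: extract tracker issues over HTTP and shape them into flat
# documents for a document store. Endpoint and fields are hypothetical.

import json
import urllib.request

def shape_issue(raw):
    """Flatten a raw tracker payload into a document for the store."""
    return {
        "key": raw["key"],
        "status": raw["fields"]["status"],
        "story_points": raw["fields"].get("story_points", 0),
        "source": "project_tracking",
    }

def extract(url):
    """Fetch and shape all issues from a (hypothetical) JSON endpoint."""
    with urllib.request.urlopen(url) as resp:  # network call, not run here
        return [shape_issue(issue) for issue in json.load(resp)["issues"]]

raw = {"key": "WEB-7", "fields": {"status": "Done", "story_points": 5}}
print(shape_issue(raw))
# With pymongo installed, loading would be roughly:
#   MongoClient().qdm.issues.insert_many(extract(url))
```

Keeping the shaping function separate from the fetch makes it trivial to test and to swap when a source system changes, which is one of the feature requirements from section 1.2.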

3.2 Dashboards & Radiators

Once you have the data, you need to display it somehow. Creating dashboards and radiators capable of displaying the right data to the right audience is important to validate your quality process. For example, high-level managers likely only care whether builds are good enough for production, developers need enough information to find and fix problems with minimal triage, and project managers likely want bug counts and estimated time to fix any issues. The most accessible way to publish the data you need in a flexible way is probably to create a web application that sits on top of the data you’re collecting and displays it how you need it.

Putting together a web app is very simple with today’s frameworks. If you’re using Python or Groovy as mentioned in the last section, you’re in luck: both have web-scaffolding frameworks (Django for Python and Grails for Groovy) that make displaying the data you’re collecting over the web relatively easy.

[Figure 2 content: data sources (Performance & Process; Build Reports & Logs; Consumer Perception) feeding an Extract Transform Load (ETL) process into a Database, with a Rules Engine, Report Dashboards, and Bug Reporting on top]


There are also plenty of open source dashboards, like Ducksboard and Graphite, that you can pipe data into. An example of a dashboard that gives you some insight into how your team is working could combine elements of your source control with elements of your project tracking, as shown in Figure 3.

Figure 3 – Comparison of Source Control and Project Tracking Data

In this example, we have some key statistics from source control on one graph and key statistics from project tracking on another. The really interesting trend here is that the amount of work done doesn’t add up to the number of tasks completed. Notice how the number of tasks is roughly parallel to the number of bugs, not the number of story points completed. Story points are supposed to indicate the amount of effort that goes into a sprint, but in this case they don’t indicate effort at all. For this team, the obvious indicator of effort is source control usage, followed by the number of bugs completed. If that’s the case, then the team needs to rethink how they estimate and plan work. This is a great example of how managing all the data from your project lifecycle can inform not only the quality of your software, but also the quality of the operation of your team.
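The hunch from Figure 3 can be checked numerically with a plain Pearson correlation between per-sprint metrics. The sample numbers below are invented to mirror the "points don't track effort" pattern described; real numbers would come from the combined documents in your store.

```python
# Sketch: test whether story points actually track effort by
# correlating per-sprint metrics. Sample data is invented.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

commits      = [40, 55, 62, 70, 85, 90]   # steadily rising effort
story_points = [30, 12, 33, 10, 31, 13]   # oscillating, uncorrelated
tasks_done   = [20, 26, 30, 33, 41, 44]   # tracks commits closely

print(round(pearson(commits, tasks_done), 2))    # strong: points near 1.0
print(round(pearson(commits, story_points), 2))  # weak: estimation problem?
```

A weak correlation between points and activity is not proof of bad estimation on its own, but it is exactly the kind of pattern worth raising with the team.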

3.3 Data Flow

The QDM system itself needs to be non-intrusive and run in parallel to the build process without slowing it down. Ideally it will gather data as code is checked in and built and tests run, and it will act as a gate in the process to stop code from moving where it shouldn’t. Figure 4 illustrates the typical grouping of tests in a CI pipeline that gives the system the data it needs to make a well-informed decision on whether code should be deployed to the next stage or sent back for rework.

[Figure 3 annotation: source control use is trending up, yet there are large dips in the tasks completed]


Figure 4 – Typical Groupings of Tests in a CI Pipeline

Static analysis will check for code standards, rules, and unit test coverage. This can be used as a hard gate to prevent code without unit tests from slipping through and catch violations before they turn into functional bugs.

The auto-flow is the combination of your functional tests and your production monitoring, as mentioned in section 2.4 and shown in Figure 4. In this phase you can also set hard gates around the data you’re getting from your production monitors and the results of your automated functional tests. If you can manage to design your functional tests so that they also serve as performance tests, then that’s awesome. If not, you likely need another phase in your build pipeline to capture performance characteristics in enough detail to come up with a quality assessment.

If you are testing in an environment with a significant number of consumers active on social media, you can harness that to find out what your consumers are saying about your changes in real time.

The final and most interesting piece of the pipeline is a Machine Learning (ML) component, which can help you identify patterns that you may not be able to identify manually. This is another technology that is becoming more popular and can be integrated with the technologies we’ve already discussed with minimal effort, using several open source libraries available today. Once you start to find patterns that your team determines are important, you can train the ML component by indicating which series of outputs you think are important, and it will then try to identify similar patterns in new data.
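To make the idea concrete, about the smallest possible "ML pattern finder" is a nearest-neighbour classifier over build-metric vectors the team has labelled good or bad. A real system would use a proper library such as scikit-learn; the metrics and labels below are invented.

```python
# Sketch: label a new build by its closest previously-labelled build.
# Each metric vector is (code_churn, log_errors, failed_tests);
# numbers and labels are invented for illustration.

def nearest_label(labelled, candidate):
    """1-nearest-neighbour classification by Euclidean distance."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    _, label = min(((dist(vec, candidate), label)
                    for vec, label in labelled), key=lambda t: t[0])
    return label

history = [
    ((120, 0, 0), "good"),
    ((150, 1, 0), "good"),
    ((900, 14, 3), "bad"),
    ((700, 9, 1), "bad"),
]
print(nearest_label(history, (800, 11, 2)))  # resembles past bad builds
```

The labelling step is where the team's judgment enters the system; the classifier only generalizes what the team has already marked important.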

4 Incorporating QDM Into Your Team The first step is to build a system that can do all this magic for you, using modern tools and frameworks that are not hard to deploy. In fact, the easiest problems to solve are technical ones; the hardest are behavioral. The most important change your team must make to adopt QDM is that everyone must agree to fix problems as soon as the system makes them visible.

[Figure 4 content: Level 1 (Static Analysis; Small & Medium Tests), Level 2 (Medium & Large Tests; Performance Profiling), Canary (Real Traffic Profiling), feeding a Results Database, ML Pattern Finder, and Distribute Results]


Once the system is set up, using these metrics becomes easy. There are a few ways you should incorporate them into your process.

The first key to incorporating QDM is “continuously”. It’s no accident that the word “continuous” is everywhere today: continuous integration, continuous delivery, continuous testing, continuous choose-your-noun. The real key to getting better software products out to your consumers as fast as possible is to make everything on your team continuous, including the metrics you use to determine the quality of your software. This also creates a healthier team, since it makes the QM process extremely transparent and obvious. By publishing useful dashboards and pushing this data to the development team as they work, you put the quality criteria in front of the entire team in real time. The key here is to enforce the rules the moment things break.

The second is to make the rules you define part of your delivery process. When the static code analysis rules you have set up aren’t met, break the build. When you start to see spikes in the error messages in your logs during a functional test run, fail the build. Failures also need to be non-negotiable. Remember, the first thing you needed to do was get agreement from the team that as soon as the system shows something is broken, you fix it. With that in mind, if the system gives you false positives, tweak the system. If the system is giving you good data and you choose to ignore or minimize it, then you might as well not bother at all.
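Encoding "break the build" as a script a CI job can run keeps the rules non-negotiable by construction. The thresholds and metric names below are illustrative; the essential point is that a breach makes the job fail so the pipeline stops.

```python
# Sketch: a build-breaking quality gate a CI job can run. Thresholds
# and metric names are illustrative assumptions.

THRESHOLDS = {          # maximum allowed values per metric
    "new_violations": 0,
    "log_errors": 0,
    "failed_tests": 0,
}

def gate(metrics):
    """Return the list of breached rules; empty means the gate passes."""
    return [name for name, limit in THRESHOLDS.items()
            if metrics.get(name, 0) > limit]

breaches = gate({"new_violations": 2, "log_errors": 0, "failed_tests": 0})
if breaches:
    print("Build failed on:", ", ".join(breaches))
    # In a real CI job: sys.exit(1) here, so the pipeline stops.
```

Tweaking the system for false positives then means editing `THRESHOLDS` in a reviewed commit, not quietly waving a red build through.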

5 Conclusion By using technologies such as map-reduce and machine learning, and harnessing the data generated throughout the SDLC, you can show objective measures of the quality of your software. By separating the components of your software delivery process you can measure several different aspects, each of which points to the health of your product and your process, can in most cases serve as a hard gate in your automation process, and ultimately gets your team closer to continuous delivery. Managing the data generated in your quality process can make quality management a more insightful, valuable, transparent, and automated part of your software delivery.