Tina Fletcher| Targeting Quality| Oct 9, 2015 Why Getting Good Test Data is Hard, and What to Do...
-
Upload
sabina-melton -
Category
Documents
-
view
213 -
download
1
Transcript of Tina Fletcher| Targeting Quality| Oct 9, 2015 Why Getting Good Test Data is Hard, and What to Do...
Tina Fletcher| Targeting Quality| Oct 9, 2015
Why Getting Good Test Data is Hard, and What to
Do About It
Currently a Test Strategist; have also been a Test Lead, Team Lead, and Test Specialist
But really… someone who cares about doing thoughtful, responsible testing that (hopefully) helps lead to better quality releases
I’m also [email protected] and @fletchertinam
Who am I?
We don’t talk about it enough– Not glamorous avoided
– Sounds easy underestimated
– Not “testing” forgotten
In my experience:– Not glamorous necessary prerequisite
– Sounds easy surprisingly challenging
– Not “testing” forgetting is expensive
Why am I talking about test data?
Software Testing
Test Data
Have made enough mistakes to start offering advice?
What the rest of this talk will look like:– 10 useful questions to ask when starting a new software project
Example applications to help solidify concepts:– Media player
– Email client
– Learning management system for online courses
Why am I talking about test data? (…cont.)
10 Useful Test-Data Related Questions to Ask When Starting a New Software Project
General definition– Research shows: “test data is the data you test with”
– Tina’s best attempt: “something a user might have, enter, or do while using your feature that you care about testing before release”
1. What is test data, in our context?
For your project– Define a test data unit
Media files, emails, online courses, students Rows of database entries, JSON objects
– Examples help to identify scope and variables <1 sec 128kbps mp3 file Email sent to 2 recipients, 3 image attachments, HTML table in body, AS 12.1 provider Course with >1500 students, 6 quizzes and heavy group discussion usage User registered as a student in 3 courses and a teaching assistant in another
1. What is test data, in our context? (…cont.)
General reasons– Need test data in order to execute tests
– Test under conditions that resemble production
– Useful for many team members (developer, tester, UX designer, product owner)
2. Why does test data matter to us?
Project-specific examples– We want to be confident that users can play all common media file types without
encountering any errors
– We want to be confident that users will be able to read and interact with any email message they receive, regardless of its properties
– We want to be confident that the online learning system’s performance is not impacted by large classes with many concurrent user sessions
– We want to be confident that permissions in the online learning system are properly restricted according to user roles
2. Why does test data matter to us? (…cont.)
Random vs controlled– Mailing list subscription or crafting specific emails
Realistic vs simplified– Testing combinations of things or one at a time
Static vs changing– Same set of files or rotating/expanding collection
Able to tweak on demand as interesting things are observed– Something weird happens after ~2 minutes of playback
3. What should our test data look like?
Understand functionality and performance expectations– Support claims for media file types
– Load time before playback starts (large files)
Prioritize risk areas and how things might fail– Emails that have been forwarded across multiple providers
Decide what is in and out of scope– Top 25 email providers
Understand your customers– Common configurations of permission settings
3. What should our test data look like? (…cont.)
Complete test data management suites (generation, analysis, cleaning):– IBM, Informatica, Grid Tools
Copy from existing source:– Test data repository
– Production
Lightweight/freeware (rows of data):– http://www.generatedata.com, http://www.randomuser.me, http://www.mockaroo.com
Mock data (APIs):– www.getsandbox.com, www.mockable.io
4. How can we get test data?
Use your team/company to generate data (“dogfooding”)– Internal training courses
Build your own tool to generate data– Email spammer (volume)
Use automated tests to generate data– Simulate student activities in a course
Generate it manually– Sometimes the only way, or most realistic way
4. How can we get test data? (…cont.)
Using multiple methods is usually best– More variation
– Each has strengths/limitations
4. How can we get test data? (…cont.)
Inconsistencies in generated data
Large number of variations to obtain
Copyright
Privacy of “real” data– Email sharing
– Masking – reduced usefulness?
5. What test data challenges might we encounter?
“Old” data– Database migration
– Time stamps
Cost– Paid accounts
– Paid tools
– Hosting multiple test environments
– Time!! …
5. What test data challenges might we encounter? (…cont)
Copy from a central location– Security/privacy concerns
– Ability to partition data on demand
– DRM restrictions
Stored in shared accounts or test environments– Return to known state / avoid irreversible change
– Conflicts during simultaneous usage
Back up your test data!
6. Where are we going to keep test data?7. How are we going to move/share test data?
Are you finding bugs?– Are they valid? Important?
Are people outside of your team finding bugs?– e.g. beta, sales, other dev teams?
Are you able to test any scenarios that you randomly think of during testing?– Note: save/share additions and improvements to the data set
8. How will we know if our test data is effective?
“Full coverage” is a myth; important to understand what’s missing
Ideas for mitigating actions:– Beta or crowd source testing
– More unit tests
– Review usage patterns (priority)
– Communicate New ideas from unexpected sources Prepare: support, ops
9. What test data were we not able to get?
Future you, or other teams, probably need to make changes in future– Avoid time cost for “reinventing the wheel”
Documentation:– What you have
– What you don’t have
– Where to find it
– How to use it
– How to add/tweak data
10. What test data legacy will we leave behind?
Post mortem notes:– How you arrived at the current set of test data (and why)
– Key things that need to be repeated
– Ideas that didn’t pan out
– Things that could have gone better
– Things you didn’t get to
– Ideas and recommendations for future
10. What test data legacy will we leave behind? (…cont.)
What other questions would you add to this list?
What challenges have you encountered related to test data?
What creative test data solutions are you proud of?
What did I miss?
… or anything else!
Over to you…
Why is getting good test data hard?– Lots of variables and data characteristics to consider
– Choosing sources and generation methods
– Balancing data realism, privacy, $$, time
– Can also be difficult to move, store, share, and tweak
What can you do about it?– Plan ahead
– Ask yourself the 10 questions
– Keep improving
Conclusion
Thank you!