
  • Peer Feedback and Methodological Improvement in Lablet Research: A Qualitative Analysis

    Lindsey McGowen, PhD, Lablet Evaluator
    Christine Burgh, Research Assistant
    North Carolina State University
    June 2, 2016

  • Lablet Goals

    • Solve hard problems in cyber security

    • Develop and conduct methodologically rigorous research

    • Develop a community of practice for SoS (Science of Security)

  • Methodology Support Activities

    • IRN-SoS

    • SoS methodology benchmarking study

    • Methodology consulting

    • Methodology guidelines

    • Methodology feedback seminars

  • Methodology Feedback Seminars

    • Format:

    – Students present their research

    – Lablet researchers provide feedback

    – Feedback is recorded via an online form

    – Presenters receive a feedback report

  • Methodology Feedback Seminars

    • Seminar format piloted in Spring 2014

    – Adjustments made each semester based on participant feedback

    • Seminars held Fall 2014 – Spring 2016

    – 25 seminars

    – 31 research presentations

    – 33 presenters

  • Methodology Feedback Seminar Impact

    • Presenters report they used the feedback to improve their research

    [Bar chart: “Feedback Used”, Yes vs. No, 0–100% scale]

  • Methodology Feedback Seminar Impact

    • Presenters report changes were made primarily to the study Intro & Background and Methodology

    [Bar chart: “Changes Made” by section (Intro & Background, Methodology, Analysis & Results, Conclusions, Language & Style), 0–100% scale]

  • Methodology Feedback as Data

    • A rising tide raises all boats?

    • Feedback reports used as evidence to investigate whether Lablet research is becoming more methodologically rigorous over time

    • Why qualitative coding?

    – Captures the perspectives of Lablet researchers

    – Experts in cyber security research

    – Enhanced validity of conclusions about the methodological rigor of research

  • Research Questions

    • RQ1: On which aspects of the research process does Lablet methodology feedback reflect a need for improvement?

    • RQ2: Does the feedback provided in Lablet methodology seminars change over time?

    • H1: Feedback from early semesters will focus on more fundamental aspects of the research process.

    • H2: Feedback from later semesters will focus on deeper and more nuanced aspects of the research process.

  • Qualitative Methodology

    • Feedback reports entered into a database

    • Data fields based on methodology guideline components:

    – Abstract/Overview

    – Background/Lit Review

    – Research Goal

    – Methodology

    – Analysis/Results

    – Conclusions

    – Language/Style

    • Feedback from respondents: N = 102

  • Qualitative Methodology

    • Upload data to qualitative coding software: Dedoose

  • Qualitative Methodology

    • Qualitative Coding:

    – Process of identifying meaning within the data and creating descriptive labels or “codes” that can be applied to the text

    – Codes grouped into topics and themes

    – Allows voluminous text to be segmented and categorized into manageable units

    – Allows for comparison and analysis across diverse research projects (see the sketch below)

    – Codes based on methodology guidelines and content analysis
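
    A minimal Python sketch of how coded feedback excerpts can be represented and tallied once codes are applied. The excerpts, codes, and semesters below are hypothetical illustrations, not the study’s data, and this is not the authors’ actual pipeline (coding was done in Dedoose):

      from collections import Counter

      # Hypothetical coded excerpts: each carries its theme, topic, and applied code.
      excerpts = [
          {"semester": "Fall2014",   "theme": "Improve",  "topic": "Methodology",
           "code": "Clarify metrics"},
          {"semester": "Fall2014",   "theme": "Positive", "topic": "Research Goal",
           "code": "Clear research questions"},
          {"semester": "Spring2016", "theme": "Improve",  "topic": "Validity",
           "code": "Specify threats to validity"},
      ]

      # Segmenting feedback into coded units allows comparison across projects,
      # e.g., how often each theme appears per semester.
      theme_counts = Counter((e["semester"], e["theme"]) for e in excerpts)
      for (semester, theme), n in sorted(theme_counts.items()):
          print(f"{semester:>10}  {theme:<8} {n}")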

  • Qualitative Methodology

    • Code Development Process

    – Select 50% of the data

    – 2 coders independently identify themes within the data and develop codes

    – Compare codes

    – Combine similar codes

    – Reconcile differences to arrive at a common set of codes

    – Develop code definitions

    – Apply codes to 100% of data

    – Calculate inter-rater reliability

    • Iterative process: 4 iterations so far

    • Coding for Methodology entries complete

    • Coding for other aspects of the research ongoing

  • Inter-Rater Reliability

    • Cohen’s kappa statistic (Cohen, 1960, “A coefficient of agreement for nominal scales”)

    – Widely used and respected measure of inter-rater agreement relative to the rate of agreement expected by chance, based on the coding behavior of each rater

    – Values above .80 are considered excellent agreement (Cicchetti, 1994; Fleiss, 1971; Landis and Koch, 1977; Miles and Huberman, 1994)

    • Cohen’s kappa for “Methodology” entries = .91 (a computational sketch follows below)
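
    For reference, a minimal Python sketch of Cohen’s kappa computed directly from its definition, kappa = (p_o - p_e) / (1 - p_e). The two rater sequences below are hypothetical, not the study’s coding data:

      from collections import Counter

      def cohens_kappa(rater1, rater2):
          """Cohen's (1960) kappa for two raters assigning nominal codes to the same items."""
          n = len(rater1)
          # p_o: observed proportion of items on which the two raters agree.
          p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
          # p_e: agreement expected by chance, from each rater's marginal code proportions.
          c1, c2 = Counter(rater1), Counter(rater2)
          p_e = sum((c1[code] / n) * (c2[code] / n) for code in c1.keys() | c2.keys())
          return (p_o - p_e) / (1 - p_e)

      r1 = ["clarify", "good", "specify", "clarify", "justify", "good"]
      r2 = ["clarify", "good", "specify", "justify", "justify", "good"]
      print(round(cohens_kappa(r1, r2), 2))  # -> 0.78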

  • Codebook

    • Coded “Methodology” entries, but responses spanned a wide range of research process aspects

    • 3 themes

    – Positive codes: good, clear

    – Improve codes: specify, clarify, justify, improve

    – Specific suggestions

    • 12 research topics

    • 53 sub-topics

    • 206 unique codes

  • Coding Examples: Background

    • Ex: Clarify contribution to the field

    – “try to characterize what new information you are trying to achieve. High level characteristics of vulnerabilities in [aspect of a cyber system]. You can then create an artifact to automatically triage [aspect of cyber system]. Can expand the same ideas to VCL, EC2, etc.”

    – “How is your framework different from these prior approaches and how does it compare”

  • Coding Examples: Threat Model

    • Ex: Specify threat model

    – “The authors should determine what the threat model is and determine adversarial models. This is an important step in evaluating the effectiveness of the proposed approach. Note that it is reasonable to not evaluate an adversarial model at first, but it eventually needs to be studied.”

    • Ex: Clarify threat model

    – “Improved clarity of the threats models faced in this approach would help.”

  • Coding Examples: Research Goal

    • Ex: Clarify research goal

    – “You talked about the research goal, not clear if preventing malicious automation or uniquely ID a user”

    • Ex: Clear research questions

    – “Good to explicitly state research question”

    – “Good identification of research question”

  • Coding Examples

    IRB

    • Ex: Clear IRB

    – “Good to mention IRB”

    Methodology

    • Ex: Clarify metrics

    – “Needs clear statement of metrics”

    – “question about quality metric – is there any subjectivity? Question about effort metric – concern about using likert scale to multiply”

  • Coding Examples: Methodology

    • Ex: Suggest within-subjects design

    – “Use everyone as their own reference, and control for individual differences that way. Have people do at least 2 scenarios”

    • Ex: Justify data source

    – “the presentation could have better motivated the value of using [app development site] data”

    – “some more discussion may be needed to talk about potential biases introduced by sampling through Mturk”

  • Coding Examples: Methodology

    • Ex: Provide logical justification

    – “Use first order logic”

    • Ex: Clarify experimental condition

    – “What is the experiment environment? Needs to be described in detail”

    – “This is a variant of the prisoners dilemma challenge, can you talk more about that?”

    – “Are you using these 2 scenarios for your study?”

  • Coding Examples: Analysis

    • Ex: Clarify analysis steps

    – “Clarify the steps you take in your analysis.”

    • Ex: Specify analysis steps

    – “What tests are you running?”

    • Ex: Good success criteria

    – “Nice game setup and methodology, success criteria and metrics”

    • Ex: Add statistical control variables

    – “learnability is confounded with reading speed - either drop slide time or measure reading speed for each participant and statistically control for that.”

  • Coding Examples: Conclusions

    • Ex: Specify next steps

    – “I understand how the completed experiments were performed, but I don't think we had enough time to discuss future work. I'm curious how future work will bring the experiments closer to practical user authentication.”

    – “Ideas for future work - There was a great discussion of how the overlay filesystem used by [system platform] may be able to be leveraged to automatically push security updates to children images.”

  • Coding Examples: Resource

    • Ex: Suggested resource

    – “Is anyone aware if anyone has studied to see if people are desensitized to security threats b/c of it's news coverage. Dr. Statton was doing a sentiment analysis, that's the closest I've heard.”

    – “I'd definitely consider citing or at least considering some of the work exploring the limitations of MTurk participants. Here's links to a couple...”

  • Coding Examples: Validity

    • Ex: Improve external validity

    – “Game shows the probability of the next threat type, but in the real world you wouldn’t have that information”

    – “best practices are only a starting place. If you talk to someone running the security operations for an enterprise, they will tell you that the best practices are necessary, but not sufficient They will want to go much beyond best practices and customize a solution to their environment”

  • Coding Examples: Validity

    • Ex: Specify threats to validity

    – “what are the treats to validity of your research? must be some threats, due to unreported vulnerabilities, or increase in programmer skill that could account for findings. any alternative explanations for what you observed?”

    – “Once additional piece of context, you always did this at the end of the semester, sometime in the middle of the semester we do a whole section of security requirements… I wonder if [we] are somehow confounding your results by teaching the class…”

  • Coding Examples: Language & Style

    • Ex: Good graphics

    – “nice graph to describe methodology/process of data collection and measurements”

    • Ex: Use precise language

    – “use similar/same language so that reader is not confused.”

    – “when describing which systems you will look at, refer to a specific sampling procedure. if you have selection criteria (such as availability of data, etc) refer to them as selection criteria.”

  • So is there any evidence from the feedback that Lablet research is becoming more methodologically rigorous over time?

  • Preliminary Results: Themes

    Theme        Fall 2014   Spring 2015   Fall 2015   Spring 2016   Total
    Positive          13%           13%          9%           15%     14%
    Improve           78%           77%         83%           74%     77%
    Suggestion         9%           10%          8%           11%     10%
    TOTAL            100%          100%        100%          100%    100%

    No clear patterns in the data so far…

  • Preliminary Results: Research Topic

    Emerging patterns in the data?

    Topic                        Fall 2014   Spring 2015   Fall 2015   Spring 2016    Total
    Background                        8.7%          7.1%       14.5%          4.7%     7.4%
    Threat Model                      0.0%          0.0%        1.8%          0.6%     0.4%
    Research Goal                    13.0%         13.1%       14.5%         10.6%    12.3%
    IRB                               0.0%          2.4%        0.0%          1.2%     1.2%
    Methodology                      52.0%         42.8%       34.5%         41.8%    43.1%
    Analysis                         13.0%          7.1%        7.3%          8.8%     8.8%
    Conclusions                       0.0%          1.2%        1.8%          1.2%     1.0%
    Resource                          0.0%          2.4%        1.8%          3.5%     2.3%
    Validity                          8.7%         13.1%       20.0%         22.9%    16.7%
    Language and Style                4.3%          7.1%        1.8%          4.1%     4.9%
    Study Description                 0.0%          2.4%        3.6%          1.2%     1.7%
    Unintelligible due to Typo        0.0%          1.2%        0.0%          0.0%     0.4%
    Total                           100.0%        100.0%      100.0%        100.0%   100.0%
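
    A minimal pandas sketch of how a topic-by-semester percentage table like the one above can be produced from coded excerpts. The data frame contents are hypothetical examples, not the study’s codes:

      import pandas as pd

      # Hypothetical coded excerpts: one row per applied code.
      codes = pd.DataFrame({
          "semester": ["Fall2014", "Fall2014", "Spring2016", "Spring2016", "Spring2016"],
          "topic":    ["Methodology", "Analysis", "Validity", "Methodology", "Research Goal"],
      })

      # Cross-tabulate topics against semesters; normalize each column so it sums to 100%.
      table = pd.crosstab(codes["topic"], codes["semester"], normalize="columns") * 100
      print(table.round(1))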

  • Some Observations

    • Still need to code entries for several more research aspects

    – Abstract/Overview, Background/Lit Review, Research Goal, Analysis/Results, Conclusions, Language/Style

    • Data entry fields do not appear to correspond to the type of feedback entered in those fields

    • Data skewed toward Spring 2016: 45% of all entries, 36% of all codes

    – Much better attendance: the seminar became a course and faculty scheduled their students

  • Next steps

    • Migrate to NVivo software

    – Dedoose is glitchy

    • Code additional data entry fields

    • Further refine codebook

    – Would like to have CS coder(s)

    • Analyze data for changes in feedback over time (a sketch of one possible test follows below)
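
    As an illustration of what the over-time analysis could look like, a minimal Python sketch of a chi-square test of independence on a theme-by-semester contingency table. The counts are hypothetical placeholders (the slides report only percentages), and scipy is an assumed dependency:

      from scipy.stats import chi2_contingency

      # Hypothetical code counts. Rows: themes (Positive, Improve, Suggestion);
      # columns: Fall2014, Spring2015, Fall2015, Spring2016.
      counts = [
          [ 3, 11,  5,  26],  # Positive
          [18, 65, 46, 127],  # Improve
          [ 2,  8,  4,  19],  # Suggestion
      ]

      chi2, p, dof, expected = chi2_contingency(counts)
      print(f"chi2={chi2:.2f}, dof={dof}, p={p:.3f}")  # a large p suggests no clear shift over time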

  • QUESTIONS?