USE OF ACCELEROMETRY AND MACHINE LEARNING TO MEASURE FREE-LIVING PHYSICAL ACTIVITY AND SEDENTARY BEHAVIOR By Alexander Henry Montoye A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Kinesiology – Doctor of Philosophy 2014

USE OF ACCELEROMETRY AND MACHINE LEARNING TO MEASURE FREE-LIVING

PHYSICAL ACTIVITY AND SEDENTARY BEHAVIOR

By

Alexander Henry Montoye

A DISSERTATION

Submitted to

Michigan State University

in partial fulfillment of the requirements

for the degree of

Kinesiology – Doctor of Philosophy

2014

ABSTRACT

USE OF ACCELEROMETRY AND MACHINE LEARNING TO MEASURE FREE-LIVING PHYSICAL ACTIVITY AND SEDENTARY BEHAVIOR

By

Alexander Henry Montoye

Physical activity (PA) and sedentary behavior (SB) are important behavioral variables that are associated with many key short- and long-term health indices. Objective and highly accurate methods of measuring PA and SB are needed in order to better understand the relationships of PA and SB with various health outcomes, determine population levels of PA and SB, identify and target groups at high risk of having low PA or high SB, and assess the effectiveness of interventions aimed at increasing PA and reducing SB in populations. Of the available measurement tools, accelerometer-based activity monitors have gained popularity due to their blend of feasibility for use and relatively high accuracy for assessing PA (by identifying specific activity types), SB, and energy expenditure (EE). However, little research has been done to compare the accuracy of accelerometers placed on different parts of the body, and current data modeling methods are either 1) simple to use but lacking in accuracy or 2) highly accurate but highly complex. Therefore, the purpose of this dissertation was 1) to develop accurate and relatively simple data processing and modeling methods for accelerometer data and 2) to compare accelerometers located on the right hip, right thigh, and both wrists for classification of activity type and prediction of SB and EE.

Healthy adults (n=44) were recruited to participate in a 90-minute simulated free-living protocol. For the protocol, participants performed 14 activities for 3 to 10 minutes each, with the order, duration, and intensity of activities left up to participants. Participants wore a portable metabolic analyzer (for a criterion measure of EE) and four accelerometers, which were placed on the right hip, right thigh, and both wrists. The order and timing of the activities performed during the protocol were recorded by a trained research assistant (for a criterion measure of activity type and SB). Machine learning algorithms (i.e., artificial neural networks) were created by extracting simple-to-compute features from the data from each of the four accelerometers in order to classify activity type and predict SB and EE. Accuracy of the four accelerometers for each outcome variable was assessed by comparing predictions from the accelerometers to the actual values obtained by the criterion measures. Additionally, we processed, cleaned, and extracted features of the accelerometer data in Microsoft Excel and created the artificial neural networks using R software, thereby accomplishing our goal of using simple methods to create machine learning algorithms to model accelerometer data.
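As a rough illustration of the workflow described above (windowed feature extraction followed by comparison of predictions against a criterion measure), the following Python sketch may help. It is not the processing used in this dissertation, which was carried out in Microsoft Excel and R. The choice of features (window mean and standard deviation), the function names, and the window length are assumptions made purely for illustration.

```python
import math
from statistics import mean, pstdev


def window_features(signal, window_len):
    """Split a 1-D acceleration signal into non-overlapping windows and
    return a (mean, standard deviation) pair for each complete window.
    Both the feature choice and the window length are illustrative
    assumptions; trailing samples that do not fill a window are dropped."""
    features = []
    for start in range(0, len(signal) - window_len + 1, window_len):
        window = signal[start:start + window_len]
        features.append((mean(window), pstdev(window)))
    return features


def rmse(predicted, measured):
    """Root mean square error between predictions and criterion values."""
    return math.sqrt(mean((p - m) ** 2 for p, m in zip(predicted, measured)))


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

In such a pipeline, the per-window features would serve as inputs to a model (here, an artificial neural network), and `rmse` and `pearson_r` would summarize agreement between the model's predictions and the criterion measure.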

Overall, the thigh accelerometer provided the highest predictive accuracy for EE, although both the wrist and hip accelerometers also provided highly accurate EE predictions. For recognition of activity type, the wrist accelerometers achieved the highest accuracy while the hip accelerometer had the lowest accuracy. Finally, for prediction of SB, the hip and left wrist accelerometers provided the highest accuracy while the right wrist accelerometer provided the lowest accuracy.

Our study highlights the strengths and weaknesses of accelerometers placed on the hip, thigh, and wrists for prediction of activity type, SB, and EE. These findings suggest that single accelerometers can be used for accurate measurement of PA, SB, and EE, although the optimal accelerometer placement site will depend on the specific research question. Further research should be conducted in a true free-living setting with a more diverse population, different sets of activities, and other types of machine learning to model the accelerometer data.

Copyright by

ALEXANDER HENRY MONTOYE

2014

I would like to dedicate this dissertation to my grandfather, Henry Montoye. You are a pioneer in the field of exercise physiology and have had a lasting positive impact on our world through your work. I feel privileged to get to follow in your footsteps, and I have had the opportunity to meet so many great scientists in the field due to my connection with you. More than that, though, you have been a wonderful grandfather. I will never forget all the card playing, drawings, broken cookies, Great Harvest breads, and Old Country Buffet trips you have shared with me over the years. You are a role model in how to lead a successful career and be an involved husband, father, grandfather, and great-grandfather. Thank you.

ACKNOWLEDGEMENTS

First, I would like to thank my advisor, Dr. Karin A. Pfeiffer, for her guidance and support in my four years at Michigan State University. You have been incredibly supportive of the different projects I have undertaken in my doctoral work, even when some of them did not directly push me toward completing my degree. I would also like to thank my dissertation committee for their assistance in designing and implementing a project that has established a solid line of research for me to continue in the future. Second, I want to thank my fellow doctoral students for making the graduate experience at Michigan State so rewarding. They have been so helpful in learning the ins and outs of teaching and research, and they have also been supportive through the highs and lows of school and non-school events. I also want to give a shout out to Chris Connolly for being a great conference roommate and lifting buddy, Kimbo Yee for being a great teaching mentor and fellow fan of the Brody cafeteria, Catherine Gammon for teaching me the true art of tea drinking, and Ian Cowburn for putting up with the whirring of my stationary bike at all times of the day.

I owe a special thank you to my parents, brother, and grandparents. I would not be where I am without your love and constant support. Lastly, I want to thank my soon-to-be wife, Laura Kohn. You have been so understanding and patient with me through my doctoral work, allowing me the time I need to complete my work but also making sure that I kept a work-life balance. I cannot thank you enough for keeping me grounded through school and helping to make our distance relationship work as well as it has. I love you and feel so lucky to get to spend my life with you.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
KEY TO SYMBOLS AND ABBREVIATIONS

CHAPTER 1: INTRODUCTION
  Physical activity and sedentary behavior
  Measurement of physical activity and sedentary behavior
SPECIFIC AIMS AND HYPOTHESES

CHAPTER 2: LITERATURE REVIEW
  Introduction
  The influence of physical activity and sedentary behavior on health
    Physical activity
    Sedentary behavior
  Accelerometry as a preferred method to measure physical activity, energy expenditure, sedentary behavior, and activity type
    Measurement methods
    The Large-Scale Integrated monitor and Caltrac
    Linear regression
    Multiple regression
    Measurement of sedentary behavior using accelerometers
    Machine learning
    Multiple sensor methods
    Accelerometer placement
    Laboratory-based vs. free-living settings
    Accelerometer reliability
    Identifying non-wear
  Summary of current evidence and future directions

CHAPTER 3: VALIDATION AND COMPARISON OF ACCELEROMETERS LOCATED ON THE WRISTS, HIP, AND THIGH FOR FREE-LIVING ENERGY EXPENDITURE PREDICTION
ABSTRACT
INTRODUCTION
METHODS
  Summary of protocol
  Participants
  Instrumentation
    ActiGraph accelerometers
    GENEA accelerometers
    Oxycon portable metabolic analyzer
  Procedure
  Data reduction and modeling
    Artificial neural networks
    Window length
    Features
    Size of the hidden layer
    Oxycon data
  Statistical analyses
  Power analysis
RESULTS
DISCUSSION
  Study strengths and limitations
  Conclusions

CHAPTER 4: COMPARISON OF ACTIVITY TYPE CLASSIFICATION ACCURACY FROM ACCELEROMETERS WORN ON THE WRISTS, HIP AND THIGH
ABSTRACT
INTRODUCTION
METHODS
  Summary of protocol
  Participants
  Instrumentation
    ActiGraph accelerometers
    GENEA accelerometers
    iPAQ portable digital assistant and direct observation
  Procedure
  Data reduction and modeling
    Artificial neural networks
    Window length
    Features
    Activity type classification
    Identifying non-wear
    Direct observation
  Statistical analyses
  Power analysis
RESULTS
  Confusion matrices
  Activity categories
  Activity intensity categories
DISCUSSION
  Strengths and limitations
  Conclusions

CHAPTER 5: VALIDATION AND COMPARISON OF ACCELEROMETERS WORN ON THE WRISTS, HIP, AND THIGH FOR MEASURING SEDENTARY BEHAVIOR
ABSTRACT
INTRODUCTION
METHODS
  Summary of protocol
  Participants
  Instrumentation
    ActiGraph accelerometers
    GENEA accelerometers
    iPAQ portable digital assistant and direct observation
  Procedure
  Data reduction and modeling
    Artificial neural networks
    Assessing sedentary behavior using accelerometers
    Direct observation
  Statistical analyses
  Power analysis
RESULTS
DISCUSSION
  Strengths and limitations
  Conclusions

CHAPTER 6: DISSERTATION SUMMARY AND RECOMMENDATIONS
  Summary of results
    Chapter 3: Estimation of energy expenditure
    Chapter 4: Classification of activity type
    Chapter 5: Estimation of sedentary behavior
    Conclusions
  Recommendations for future research

APPENDICES
  APPENDIX A: Consent form
  APPENDIX B: Recruitment flyer
  APPENDIX C: Email flyer
  APPENDIX D: Supplemental figures

REFERENCES

LIST OF TABLES

Table 2.1. Comparison of wireless accelerometer systems for activity classification accuracy and EE prediction accuracy
Table 2.2. Comparison of different monitor placements for activity classification accuracy and EE prediction accuracy
Table 3.1. Activities performed during the simulated free-living protocol
Table 3.2. Features used for EE prediction
Table 3.3. Feature sets used for creation and testing of ANNs
Table 3.4. Minimum Pearson correlations detectable for a given sample size and power
Table 3.5. Demographic characteristics of participants enrolled in study
Table 3.6. Correlations of measured vs. predicted EE
Table 3.7. Bias for measured vs. predicted EE
Table 4.1. Activities performed during the simulated free-living protocol
Table 4.2. Features used for EE and activity type prediction
Table 4.3. Feature sets used for creation and testing of ANNs
Table 4.4. Demographic characteristics of participants enrolled in study
Table 4.5. Overall sensitivity, specificity, and AUC for each of the four accelerometer placements for feature set 1
Table 4.6. Confusion matrix for activity type classification from a hip-mounted ActiGraph accelerometer
Table 4.7. Confusion matrix for activity type classification from a thigh-mounted ActiGraph accelerometer
Table 4.8. Confusion matrix for activity type classification from a GENEA accelerometer mounted on the left wrist
Table 4.9. Confusion matrix for activity type classification from a GENEA accelerometer mounted on the right wrist
Table 4.10. Activity-specific sensitivity, specificity, and AUC among the four accelerometer placement sites
Table 4.11. Overall sensitivity, specificity, and AUC among the four accelerometer placement sites with combined activity categories
Table 4.12. Activities classified into activity intensities by the Compendium and by measured METs
Table 4.13. Overall sensitivity, specificity, and AUC among the four accelerometer placement sites for classification of activity intensity
Table 5.1. Activities performed during the simulated free-living protocol
Table 5.2. Features used for EE and activity type prediction
Table 5.3. Demographic characteristics of participants enrolled in study
Table 5.4. Root mean square error for prediction of total time spent in SB and breaks in SB
Table 6.1. Overall sensitivity, specificity, and AUC among the four accelerometer placement sites for classification of activity intensity using the energy expenditure ANNs (developed in Chapter 3)

LIST OF FIGURES

Figure 3.1. ANN for predicting EE
Figure 3.2. RMSE values for predicted vs. measured EE
Figure 4.1. ANN for predicting activity type
Figure 4.2. Sensitivity for the four accelerometers, compared among feature sets
Figure 4.3. Comparison of dominant and non-dominant wrist accelerometer sensitivities
Figure 5.1. ANN for predicting activity type and sedentary behavior
Figure 5.2. Predictions of total time spent in SB compared to a criterion measure (direct observation)
Figure 5.3. Predictions of breaks in SB using a five-second interval
Figure 5.4. Predictions of breaks in SB using a 30-second interval
Figure 5.5. Predictions of breaks in SB using a 60-second interval
Figure B.1. Recruitment flyer
Figure D.1. Equipment worn by participants during the 90-min protocol. Participant shown is performing the lying activity (T1)
Figure D.2. Example of participant performing reading activity (T2)
Figure D.3. Example of participant performing computer use activity (T3)
Figure D.4. Example of participant performing standing activity (T4)
Figure D.5. Example of participant performing laundry activity (T5)
Figure D.6. Example of participant performing sweeping activity (T6)
Figure D.7. Example of participant performing walking slow and fast activities (T7 and T8)
Figure D.8. Example of participant performing jogging activity (T9)
Figure D.9. Example of participant performing cycling activity (T10)
Figure D.10. Example of participant performing stair use activity (T11)
Figure D.11. Example of participant performing biceps curls activity (T12)
Figure D.12. Example of participant performing squats activity (T13)
Figure D.13. Example of non-wear (T14)

KEY TO SYMBOLS AND ABBREVIATIONS

ANN             artificial neural network
AUC             area under the receiver operating characteristic curve
BMI             body mass index
counts/minute   accelerometer signal counts per minute
CSA             Computer Science Application accelerometer
CV              coefficient of variation
DO              direct observation
EE              energy expenditure
g               gravitational force
HR              heart rate
Hz              hertz
IDEEA           Intelligent Device for Energy Expenditure and Activity
kcal            kilocalorie (or Calorie)
kcal/wear time  kilocalories per hour of time the accelerometer was worn
kg              kilogram
kg/m²           kilograms per meter squared
LPA             light-intensity physical activity
LSI             Large-Scale Integrated motor activity monitor
MET-hour        metabolic equivalent hours
METs            metabolic equivalents
ml              milliliter
ml/kg/min       milliliters per kilogram body mass per minute
mph             miles per hour
MVPA            moderate-to-vigorous intensity physical activity
NHANES          National Health and Nutrition Examination Survey
PA              physical activity
PDA             personal digital assistant
r               Pearson correlation
RMANOVA         repeated measures analysis of variance
RMSE            root mean square error
rpm             revolutions per minute
SB              sedentary behavior
SD              standard deviation
TV              television
VCO2            volume of carbon dioxide expelled
VO2             volume of oxygen consumed
x-axis          vertical accelerometer axis
y-axis          medial-lateral accelerometer axis
z-axis          anterior-posterior accelerometer axis

CHAPTER 1

INTRODUCTION

Physical activity and sedentary behavior

Physical activity (PA) is widely recognized for its beneficial effects on many aspects of health, including reduced risk of obesity (King and Tribble 1991), hypertension (Paffenbarger, Wing et al. 1983; Chobanian, Bakris et al. 2003), diabetes (Healy, Wijndaele et al. 2008), cardiovascular disease (Paffenbarger, Hyde et al. 1986; Morris, Clayton et al. 1990), some cancers (Thune and Furberg 2001), and all-cause mortality (Lee and Skerrett 2001). Based on most evidence, the US Department of Health and Human Services recommends a minimum of 150 min/week of moderate-intensity PA or 75 min/week of vigorous-intensity PA, defined as activities requiring an energy expenditure (EE) of at least 3.0 or 6.0 times the resting level (METs), respectively, to experience health benefits (2008).

Activities below 3.0 METs do not qualify as moderate- or vigorous-intensity PA and

instead are labelled as either light-intensity PA or sedentary behavior. Sedentary behavior (SB) is

defined as a supine or seated activity requiring low levels of EE (< 1.5 METs) (Owen, Healy et al.

2010; SBRN 2012). Examples of SB include watching television (TV), using a computer, or

driving. SB has historically been viewed as a lack of moderate-to-vigorous PA (MVPA); however,

recent epidemiological and laboratory-based evidence suggests that SB elicits distinct physiologic

responses from MVPA, with high levels of SB associated with diminished metabolic (Hamilton,

Hamilton et al. 2004; Hamilton, Hamilton et al. 2007), cardiovascular (Schrage 2008), and bone

health (Zerwekh, Ruml et al. 1998) and increased risk of obesity (Hu, Li et al. 2003), some cancers

(Howard, Freedman et al. 2008), and all-cause mortality (Katzmarzyk, Church et al. 2009). It is

important to note that the associations between SB and negative health conditions exist

independently of total MVPA (Owen, Healy et al. 2010). These associations are especially

concerning given that technological advances (e.g., motor vehicles, TV, computers) have

contributed to an increased time spent sedentary (Matthews, Chen et al. 2008). Moreover, there

are several components of SB that may influence health, notably the total time spent in SB (Healy,

Wijndaele et al. 2008) as well as the number of times SB is broken up by non-sedentary activities

(breaks in SB) (Healy, Dunstan et al. 2008). Thus, it is important to be able to accurately measure

each of these components to determine the true influence of SB on health.
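
The intensity categories and the two SB components described above can be illustrated with a minimal sketch. It assumes one observation per minute, posture labels named "sitting" and "lying", and a break defined as a sedentary minute followed directly by a non-sedentary minute; the function names and the example record are hypothetical and are not taken from the study.

```python
from typing import List, Tuple


def classify_intensity(mets: float, posture: str) -> str:
    """Label one observation using the MET thresholds given in the text."""
    if mets >= 6.0:
        return "vigorous"
    if mets >= 3.0:
        return "moderate"
    # Below 3.0 METs: sedentary only if seated or supine AND under 1.5 METs.
    if mets < 1.5 and posture in ("sitting", "lying"):
        return "sedentary"
    return "light"


def summarize_sb(labels: List[str], minutes_per_label: float = 1.0) -> Tuple[float, int]:
    """Return (total sedentary minutes, number of breaks in SB).

    Assumption: a break is counted each time a sedentary observation is
    immediately followed by a non-sedentary observation.
    """
    total = sum(minutes_per_label for label in labels if label == "sedentary")
    breaks = sum(
        1
        for prev, cur in zip(labels, labels[1:])
        if prev == "sedentary" and cur != "sedentary"
    )
    return total, breaks


# Hypothetical minute-by-minute record of (METs, posture).
record = [
    (1.2, "sitting"), (1.3, "sitting"), (2.0, "standing"), (3.5, "walking"),
    (1.1, "lying"), (1.0, "lying"), (6.5, "running"),
]
labels = [classify_intensity(mets, posture) for mets, posture in record]
total_sb, sb_breaks = summarize_sb(labels)
```

Under these assumptions, quiet standing below 1.5 METs is labelled light-intensity PA rather than SB, because the definition requires a seated or supine posture.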

Despite the available evidence, there are still knowledge gaps regarding the specific effects

of PA and SB on health (PAGAC 2008). For example, there is not enough research into SB to

allow for evidence-based recommendations to be developed. Additionally, there is currently only

limited evidence of dose-response or threshold effects of SB on chronic health conditions such as

heart disease and cancer (Owen, Healy et al. 2010). These knowledge gaps are due mainly to the

absence of a single measurement tool that is valid for measuring both PA and SB and that can be

used for a variety of activities and environments (Owen, Healy et al. 2010). Without such a

measurement tool, researchers will be unable to accurately assess the relationship of PA and SB to

health outcomes, monitor precise levels of PA or SB, or evaluate the effectiveness of interventions

aimed to increase PA and decrease SB.

Measurement of physical activity and sedentary behavior

PA and SB can be assessed using a number of different methods, but accelerometers have

emerged as a preferred method of assessing free-living PA and SB due to their objectivity, minimal

participant burden, and rich data that can be collected for periods of up to 4-6 weeks and beyond

(Welk 2002). Accelerometer data can be used to estimate energy expenditure (EE) and time spent

in various activity intensities (sedentary, light, moderate, vigorous). Accelerometers are generally

worn on the hip for comfort, convenience, and utility for measuring movements of the whole body.

Additionally, hip-mounted accelerometers have shown good utility for measuring ambulatory

activities (e.g., walking, running) in laboratory-based settings (Freedson, Melanson et al. 1998;

Rothney, Schaefer et al. 2008; Lyden, Kozey et al. 2011).

Traditionally, accelerometer data have been filtered and then translated into ‘activity

counts.’ These counts are then placed into simple linear regression equations to estimate EE

(Montoye, Washburn et al. 1983; Freedson, Melanson et al. 1998). Linear regression works well

for measuring the energy cost of ambulatory activities (i.e., walking and running), but it

dramatically under- or overestimates the EE requirement of many sedentary, lifestyle, and exercise

activities and does not allow for classification of activity type (i.e., classifying activities as sitting,

walking, running, cycling, etc.) (Crouter, Churilla et al. 2006; Rothney, Schaefer et al. 2008;

Lyden, Kozey et al. 2011). Other data processing methods, such as machine learning, have

recently evolved as successful alternatives for analyzing data collected from accelerometers.
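
The traditional count-based approach can be sketched as an ordinary least-squares fit of measured EE on activity counts. The calibration values below are invented for illustration and are not coefficients from any published equation; the function and variable names are hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical calibration data from ambulatory activities:
# counts/minute from the monitor and measured METs from a criterion.
counts_per_min = [0.0, 1000.0, 2000.0, 3000.0, 4000.0]
measured_mets = [1.0, 2.0, 3.0, 4.0, 5.0]

slope, intercept = fit_line(counts_per_min, measured_mets)


def predict_mets(counts: float) -> float:
    """Estimate METs from counts/minute with the fitted line."""
    return intercept + slope * counts


# An activity that registers few counts receives a low EE estimate
# regardless of its true energy cost, which is the limitation noted above.
low_count_estimate = predict_mets(200.0)
```

Because the single input is counts, the equation returns the same estimate for any two activities that produce the same counts, and it carries no information about activity type.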

Machine learning is the general term for an array of mathematical techniques that can be

used to recognize patterns in data and use those patterns to accurately predict activity type or EE.

Machine learning bears some similarities to linear regression; for example, both machine

learning and linear regression use one or more input (independent) variables (e.g., accelerometer

counts, heart rate, etc.) to predict an outcome (such as EE). However, unlike traditional linear

regression, machine learning techniques do not assume a simple relationship between

accelerometer counts and EE, and machine learning takes more information from the

accelerometer than just counts (e.g., monitor orientation, patterns of count accrual). Machine

learning techniques are more complicated than linear regression, but they show improvements in

EE measurement (Rothney, Neumann et al. 2007; Staudenmayer, Pober et al. 2009) and allow

for classification of activity type (Khan, Lee et al. 2008; Khan, Lee et al. 2010; Trost, Wong et

al. 2012), thereby allowing estimation of time spent in SB and breaks in SB. Currently, there is

no consensus on which machine learning technique is best for EE measurement or activity

classification; however, artificial neural networks (ANNs) have received the most use in

kinesiology-based studies because they can be used to predict both continuous variables (such as

EE) and categorical variables (such as activity type classification). Additionally, ANNs can be

applied to data from commonly used accelerometers and can be developed from freely available

software packages (e.g., R statistical software) (Staudenmayer, Pober et al. 2009).
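
As a rough illustration of how machine learning inputs can carry more information than counts alone, the sketch below summarizes windows of raw triaxial acceleration into simple per-axis features. The choice of features (mean and standard deviation), the window length, and all names are assumptions made for this example and do not describe the feature set used in this dissertation. With raw data in g, the per-axis mean partly reflects monitor orientation relative to gravity, and the standard deviation reflects how variable the movement is.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[float, float, float]  # (x, y, z) acceleration in g


def split_windows(samples: Sequence[Sample], window_len: int) -> List[Sequence[Sample]]:
    """Cut the signal into consecutive, non-overlapping windows."""
    last_start = len(samples) - window_len
    return [samples[i:i + window_len] for i in range(0, last_start + 1, window_len)]


def window_features(window: Sequence[Sample]) -> Dict[str, float]:
    """Compute the mean and standard deviation of each axis in one window."""
    features: Dict[str, float] = {}
    for index, axis in enumerate(("x", "y", "z")):
        values = [sample[index] for sample in window]
        features[f"mean_{axis}"] = mean(values)
        features[f"sd_{axis}"] = pstdev(values)
    return features


# Hypothetical signal: two samples with gravity on the x-axis,
# then two samples with gravity on the z-axis.
signal = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.2, 1.0)]
feature_rows = [window_features(w) for w in split_windows(signal, window_len=2)]
```

Each row of features would then serve as one input vector to an ANN, with the criterion EE or activity type for that window as the target.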

Despite the emphasis on measurement of EE and time spent in MVPA, classification and

measurement of SB have lagged behind. Only very recently have validation studies been

conducted specifically to assess the ability of accelerometers to accurately measure time spent in

SB and breaks in SB, and these studies have yielded mixed results (Grant, Ryan et al. 2006;

Kozey-Keadle, Libertine et al. 2011; Lyden, Kozey Keadle et al. 2012). Additionally, SB has

rarely been included in protocols utilizing machine learning for EE and activity classification

(Freedson, Lyden et al. 2011), leaving a large gap in the literature regarding the utility for

accelerometer measurement of SB. Finally, standing has often been considered a type of SB, but

standing involves significant contraction of muscles in the legs and postural muscles and does not

have many of the negative physiologic effects of prolonged sitting or lying (Hamilton, Hamilton et

al. 2004; Hamilton, Hamilton et al. 2007). Additionally, it may be that different types of SB elicit

different amounts of muscle contraction (e.g., sitting at a computer might require postural muscles,

while lying down may not). Therefore, an accurate measurement tool must be able to differentiate

standing from SB and also differentiate among different types of SB in order to gain an

understanding of the true health risks of SB.

Ultimately, there must be a balance between quality of information/data collected in a

research study vs. the burden on participants and researchers. For accelerometer-based activity

monitor data, use of multiple monitors and collection of several physiologic variables can

improve EE measurement and activity classification (Rothney, Neumann et al. 2007; De Vries,

Garre et al. 2011; Dong, Biswas et al. 2013). However, using more monitors also increases participant

burden dramatically, which may lower compliance rates and, consequently, reduce the amount

and quality of data collected. Additionally, large-scale studies cannot easily use multiple

monitors due to the dramatic increase in time, burden, and cost necessary to collect, process, and

analyze the data. Use of a single activity monitor that can collect data on one or more variables

is strongly preferred for large, free-living studies due to ease of use for participants and

researchers while still providing a valid measurement of the PA outcome variable(s) of interest.

Additionally, machine learning techniques are much more complex to use and understand than

traditional linear regression techniques. In order to make machine learning suitable for

researchers to use, current approaches to developing and using machine learning must be

simplified as much as possible without losing measurement accuracy. In summary, there is a

need to refine the methodology for using a single activity monitor for measurement of EE and

classification of both SB and non-sedentary activities, especially in free-living settings;

additionally, efforts to reduce the complexity of machine learning will make this approach more

accessible to researchers who want to measure PA, EE, and/or SB but who are not measurement

specialists.

Hip-mounted accelerometers are commonly used for comfort and utility for measuring

ambulatory activities, but they may offer a more limited ability to classify certain types of

activities. Machine learning techniques have been applied to hip-mounted accelerometers with a

high degree of success for measuring EE (Rothney, Neumann et al. 2007; Staudenmayer, Pober

et al. 2009; Trost, Wong et al. 2012) and activity type (when assessing non-sedentary activities)

(Khan, Lee et al. 2008; Bonomi, Plasqui et al. 2009; Khan, Lee et al. 2010). Conversely, these

techniques have rarely been used for measurement of total SB, distinguishing among standing

and different types of SB, or measuring breaks in SB (Freedson, Lyden et al. 2011; Trost, Wong

et al. 2012). Given the previous success of ANNs for improvement of activity type classification

and EE prediction, it is likely that creation of ANNs trained on data that include both sedentary

and non-sedentary activities will further improve assessment of activity type classification, time

spent in SB, and breaks in SB while also improving EE measurement. Our study will address

this shortcoming in the literature by creating and validating an ANN based on both sedentary and

non-sedentary activities for a hip-mounted ActiGraph accelerometer. This ANN will be tested

for its utility to correctly classify activity type and measure time in SB, breaks in SB, and EE.

While the hip is the most common accelerometer placement location for measuring

activity, there is evidence that placement on other parts of the body, such as the thigh and

wrist, can yield similar or slightly better accuracy for measuring PA and EE (Bouten, Sauren et

al. 1997; Bao and Intille 2004; De Vries, Garre et al. 2011; Esliger, Rowlands et al. 2011;

Mannini, Intille et al. 2013). Additionally, there is consistent evidence that thigh-mounted

accelerometers can accurately measure total SB and breaks in SB, which may not be true of a

hip-mounted accelerometer (Grant, Ryan et al. 2006; Hart, Ainsworth et al. 2011; Kozey-Keadle,

Libertine et al. 2011; Lyden, Kozey Keadle et al. 2012). However, to date, no published study

has evaluated a thigh-mounted accelerometer for its utility in assessing measurement of EE or

classification of both PA and SB. Thus, our study will also develop and test an ANN for

classifying activity type and measuring SB and EE using data from a thigh-mounted ActiGraph

accelerometer.

The GENEA is a newly developed accelerometer (designed to be worn on the wrist) that

has four features that serve to dramatically improve compliance and non-wear determination: 1)

it has a thin, low-profile design, 2) it is waterproof, 3) it has a battery life and memory capacity

of up to 45 days, and 4) it has a temperature sensor to help detect when it is being worn.

Therefore, the monitor does not need to be removed for any reason during data collection, and if

it is, the temperature sensor will help to determine exact wear-time. These advantages, along

with the GENEA’s raw data recording and reasonable price, make the GENEA ideal for

measuring EE and SB and classifying activity type in free-living situations and large studies.

Using the traditional, cut-point approach, the GENEA has been shown to have high accuracy for

measuring EE (r>0.80) in a validation study when worn on the hip and wrist (Esliger, Rowlands

et al. 2011) but much lower accuracy for classifying activity intensity in a cross-validation of the

cut-points (Welch, Bassett et al. 2013; Welch, Bassett et al. 2014). Recently, the wrist-worn

GENEA was tested using machine learning and showed high accuracy (>95% classification

accuracy) for identifying 10-12 types of activities in a laboratory-based setting (Zhang,

Rowlands et al. 2012). However, there are still many unanswered questions regarding the

GENEA, including the ability to use machine learning to predict EE and measure total SB and

breaks in SB, especially in a free-living environment. Therefore, our study developed and tested

ANNs to measure SB and EE and classify activity type from raw data obtained from two

wrist-mounted GENEA accelerometers. Additionally, it is conventional for wrist-mounted

accelerometers to be worn on the non-dominant wrist due to perceived superior measurement

accuracy, but there is little evidence to support this convention. Therefore, the current study

tested and compared monitors worn on both wrists to examine differences in accuracy between

the dominant and non-dominant wrists.

Finally, measurement techniques are often validated for use in heavily controlled,

laboratory-based settings (Freedson, Melanson et al. 1998; Esliger, Rowlands et al. 2011; Zhang,

Rowlands et al. 2012; Dong, Montoye et al. 2013). This method is important for providing a

proof-of-concept that a technique can accurately measure what it is supposed to measure and

identify potential limitations of the measurement technique. However, laboratory settings are

very different than free-living conditions, and there is considerable evidence showing that

predictive models developed in laboratory validations do not work well when applied to free-

living settings (Hendelman, Miller et al. 2000; Swartz, Strath et al. 2000; Freedson, Lyden et al.

2011; Gyllensten and Bonomi 2011; van Hees, Golubic et al. 2013; Welch, Bassett et al. 2014).

Therefore, it is important to incorporate aspects of a free-living setting into validation studies to

increase their real-world generalizability.

In summary, our study developed and assessed the accuracy of ANNs for the

measurement of EE, SB, and activity type using data collected from hip- and thigh-mounted

ActiGraph accelerometers and two wrist-mounted GENEA accelerometers. These ANNs were

created and validated in a free-living simulation, using a portable metabolic analyzer as the

criterion measure of EE and direct observation (DO) as the criterion measure of activity type,

SB, and breaks in SB.

SPECIFIC AIMS AND HYPOTHESES

Objective 1: In a simulated free-living setting, create and test an ANN to estimate EE for a hip-

mounted ActiGraph GT3X+ accelerometer, a thigh-mounted ActiGraph GT3X+ accelerometer,

and two wrist-mounted GENEActiv accelerometers (total of four ANNs).

Aim 1: Create EE ANNs for the four accelerometers using simple-to-understand accelerometer

signal features and a freely available software package and test a range of potential features to

identify which are most relevant for inclusion in the ANNs. This aim is not hypothesis-driven.

Aim 2: Assess the criterion validity of the hip, thigh, and wrist ANNs developed for the four

accelerometers for estimating EE, using EE measured by a portable metabolic analyzer as a

criterion.

- Hypothesis 2a: All four accelerometers would have at least moderately high validity for

measuring EE, as demonstrated by Pearson correlation coefficients of r≥0.60.

- Hypothesis 2b: The thigh-mounted accelerometer would have the highest accuracy (as

represented by the lowest root mean square error [RMSE] and highest Pearson

correlations [r]) for predicting EE, and the wrist-mounted accelerometers would have the

lowest accuracy (highest RMSE and lowest r values) for predicting EE. The hip

accelerometer placement would be significantly less accurate than the thigh but

significantly more accurate than the wrist accelerometers. Differences among RMSE and

r values were evaluated using repeated-measures analysis of variance (RMANOVA).

- Hypothesis 2c: Accuracy for predicting EE would be similar for the accelerometers worn on the

dominant and non-dominant wrists. Differences between RMSE and r values for the left and

right wrist placement sites were evaluated using RMANOVA.
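
The two accuracy measures named in Hypotheses 2a-2c can be computed as in the following minimal sketch. The EE values are invented for illustration, and the function names are hypothetical.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], criterion: Sequence[float]) -> float:
    """Root mean square error between predicted and criterion values."""
    squared = [(p - c) ** 2 for p, c in zip(predicted, criterion)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical criterion EE (METs) and ANN predictions that are
# consistently 0.5 METs too high.
criterion_mets = [1.0, 2.0, 3.0, 4.0]
predicted_mets = [1.5, 2.5, 3.5, 4.5]

error = rmse(predicted_mets, criterion_mets)
correlation = pearson_r(predicted_mets, criterion_mets)
```

In this example the predictions track the criterion perfectly in rank and spacing yet are biased upward, so the correlation is high while the RMSE is not zero. This is one reason to report both measures rather than either one alone.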

Objective 2: In a simulated free-living setting, create and test ANNs to correctly classify activity

type from a hip-mounted ActiGraph GT3X+ accelerometer, a thigh-mounted ActiGraph GT3X+

accelerometer, and two wrist-mounted GENEActiv accelerometers (total of four ANNs).

Aim 3: Create activity type ANNs using simple-to-understand accelerometer signal features and

a freely available software package and evaluate the utility of different sets of accelerometer

features for inclusion in the ANNs. This aim is not hypothesis-driven.

Aim 4: Assess the criterion validity of the ANNs for the four accelerometers for classifying

activity type, using direct observation (DO) of activity type as a criterion measure. For

hypotheses 4b-4f, differences among accelerometer placement sites were evaluated by

RMANOVA.

- Hypothesis 4a: Overall classification accuracies (determined by sensitivity of the ANNs)

would be at least 70% for the thigh-, hip-, and wrist-mounted accelerometers.

- Hypothesis 4b: Overall classification accuracy would be significantly higher for the

thigh-mounted accelerometer than the hip- or wrist-mounted accelerometers.

- Hypothesis 4c: For ambulatory activities (walking, jogging) and climbing/descending

stairs, all four accelerometers would have classification accuracies no more than 5%

different among accelerometers.

- Hypothesis 4d: For lifestyle activities (laundry and sweeping) and exercise activities

(biceps curls and squats), the wrist-mounted accelerometers would yield significantly

higher classification accuracy than the hip- or thigh-mounted accelerometers.

- Hypothesis 4e: For SB (lying and sitting), standing, and cycling, the thigh-mounted

accelerometer would yield significantly higher classification accuracy than the hip- or

wrist-mounted accelerometers.

- Hypothesis 4f: The dominant and non-dominant wrist accelerometers would yield

classification accuracies not significantly different from each other.
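
A minimal sketch of how classification accuracy might be scored against DO is given below. It assumes that sensitivity for an activity is the proportion of directly observed instances of that activity that the ANN labels correctly, and that overall accuracy is the proportion of all instances labelled correctly; the labels and names are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence


def sensitivity_by_activity(observed: Sequence[str], predicted: Sequence[str]) -> Dict[str, float]:
    """Per-activity sensitivity: correct predictions / observed instances."""
    totals = Counter(observed)
    correct = Counter(o for o, p in zip(observed, predicted) if o == p)
    return {activity: correct[activity] / totals[activity] for activity in totals}


def overall_accuracy(observed: Sequence[str], predicted: Sequence[str]) -> float:
    """Proportion of all instances that were classified correctly."""
    matches = sum(1 for o, p in zip(observed, predicted) if o == p)
    return matches / len(observed)


# Hypothetical DO labels and ANN predictions for six windows.
observed = ["sitting", "sitting", "standing", "walking", "walking", "cycling"]
predicted = ["sitting", "standing", "standing", "walking", "walking", "walking"]

per_activity = sensitivity_by_activity(observed, predicted)
accuracy = overall_accuracy(observed, predicted)
```

Reporting sensitivity by activity alongside the overall value shows which activity types drive the errors, which is the comparison Hypotheses 4c-4e are built around.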

Objective 3: In a simulated free-living setting, use the activity type ANNs (created in Aim 3) for

the four accelerometers for determining total time spent in SB and breaks in SB.

Aim 5: Assess the criterion validity of the activity type ANNs developed for the four

accelerometers for estimating total time spent in SB, using DO as the criterion measure. For

hypotheses 5a-5c, differences among accelerometer placement sites and the criterion measure

were evaluated using RMANOVA.

- Hypothesis 5a: Total time spent in SB estimated from the thigh-mounted accelerometer

would not be significantly different from DO-measured total time spent in SB (i.e., the

thigh-mounted accelerometer would accurately measure total time spent in SB).

- Hypothesis 5b: The wrist-mounted accelerometers would significantly underpredict total

time spent in SB compared to that measured by DO.

- Hypothesis 5c: The hip-mounted accelerometer would significantly overpredict total time

spent in SB compared to that measured by DO.

Aim 6: Assess the criterion validity of the ANNs developed for the four accelerometers for

classifying breaks in SB, using DO as the criterion measure. For hypotheses 6a-6c, differences

among accelerometer placement sites and the criterion measure were evaluated using

RMANOVA.

- Hypothesis 6a: Breaks in SB estimated from the thigh-mounted accelerometer would not

be significantly different from DO-measured breaks in SB (i.e., the thigh-mounted

accelerometer would accurately measure breaks in SB).

- Hypothesis 6b: The wrist-mounted accelerometers would significantly overpredict breaks

in SB compared to that measured by DO.

- Hypothesis 6c: The hip-mounted accelerometer would significantly underpredict breaks

in SB compared to that measured by DO.

This dissertation is split up into several chapters. Chapter 2 provides a comprehensive

review of the literature regarding the use of accelerometers to measure physical activity and

sedentary behavior. Then, Chapter 3 addresses Objective 1 (EE estimation), Chapter 4 addresses

Objective 2 (activity type prediction), and Chapter 5 addresses Objective 3 (sedentary behavior

measurement). Finally, Chapter 6 summarizes the findings of the dissertation and provides

areas for further study.

CHAPTER 2

LITERATURE REVIEW

Introduction

Both PA and SB have been shown to influence health. However, the bulk of research

conducted to date has focused on PA, with SB being classified as a lack of PA. This

conventional definition where PA and SB are on opposite ends of a continuum fails to recognize

the complex nature of SB or the independent effects being sedentary may have on people’s

health, even when they obtain the recommended weekly PA dose (Pate, O'Neill et al. 2008;

Dunstan, Howard et al. 2012). For PA, SB, and EE measurement in surveillance, observational,

and intervention studies, recall methods are commonly used due to their low cost and minimal

burden on both participants and researchers. However, accelerometer-based measurement of PA,

SB, and EE is preferred due to its objectivity and potentially improved capability for accurate

measurement of these variables (Welk 2002).

With recent technological improvements in accelerometer capabilities, machine learning

has become a popular method used to process and analyze accelerometer data. While former

processing techniques could only measure EE or activity intensity and were developed for hip-

mounted accelerometers, machine learning allows researchers to use accelerometers to measure

EE and classify activity type when worn on the hip or other parts of the body (Preece, Goulermas

et al. 2009). Hip placement works well for ambulatory activities (Rosenberger, Haskell et al.

2013), and wrist placement improves compliance and allows for sleep measurement (Mannini,

Intille et al. 2013; Rosenberger, Haskell et al. 2013). However, while hip placement is better

than wrist placement for measurement of SB (Rosenberger, Haskell et al. 2013), neither hip nor

wrist placement allows for acceptable accuracy for measurement of SB (Lyden, Kozey Keadle et

al. 2012; Rosenberger, Haskell et al. 2013), which may be due partly to lack of sedentary

activities used to train machine learning algorithms. Thigh placement appears optimal for

measuring SB (Kozey-Keadle, Libertine et al. 2011; Lyden, Kozey Keadle et al. 2012).

Additionally, a few studies showing successful use of a thigh-mounted accelerometer for

classification of SB and non-sedentary activities (Skotte, Korshoj et al. 2012; Dong, Montoye et

al. 2013) and estimation of EE (Metcalf, Curnow et al. 2002) provide preliminary evidence that

the thigh placement may be the ideal solution for comprehensive measurement of PA, SB, and

EE. The current study will directly compare the utility of hip-, thigh-, and wrist-mounted

accelerometers for classifying SB and non-sedentary activities and measuring total time in SB,

breaks in SB, and EE in a simulated free-living setting. This literature review begins by

discussing the independent risks of low PA and high SB on multiple health outcomes. Then, the

review addresses the strengths and weaknesses of available measurement methods, focusing on

the progression in the use of accelerometers and the current state of accelerometer use. Finally,

this review highlights several gaps that exist in measurement of EE, SB, and activity type,

leading to the rationale for the current study.

The influence of physical activity and sedentary behavior on health

Physical activity

PA has long been recognized for its importance in maintaining and improving health.

Historical figures such as Hippocrates recognized the beneficial effects of PA on health as early

as 400 B.C., writing the following in his book called Regimen: “Eating alone will not keep a man

well; he must also take exercise” (Precope 1952). Since that time, substantial evidence has been

collected to support the role of PA in lowering risk of depression (Martinsen, Hoffart et al. 1989),

obesity (King and Tribble 1991; Blair 1993), hypertension (Paffenbarger, Wing et al. 1983;

Chobanian, Bakris et al. 2003), type II diabetes (Manson, Nathan et al. 1992; Healy, Wijndaele et

al. 2008), cardiovascular disease (Paffenbarger, Hyde et al. 1986; Morris, Clayton et al. 1990),

some cancers (Shephard 1990; Thune and Furberg 2001; Slattery 2004), and all-cause mortality

(Paffenbarger, Hyde et al. 1986; Kampert, Blair et al. 1996). By the 1990s, the evidence was

sufficient to recommend a minimum of 30 min/day of PA on most or all days of the week to

achieve health benefits (Pate, Pratt et al. 1995). Since these recommendations, several

updates have been published, and the most recent recommendations include a more specific dose

of PA (150 min/week in at least moderate intensity, defined as any activity eliciting an EE of at

least 3.0 METs), separate recommendations for resistance training, and separate or modified

recommendations for children, older adults, adults with disabilities, and pregnant individuals

(2008). PA is commonly measured in min/day for comparison to recommendations, but PA can

also be assessed indirectly through measuring EE, which is useful in terms of energy balance and

assessing total PA. Therefore, an ideal measurement tool should be able to measure both

constructs.

Sedentary behavior

Anyone who does not meet the national recommendations of obtaining at least 150

min/week of MVPA has traditionally been considered sedentary (2008; Pate, O'Neill et al. 2008).

However, Pate et al. (Pate, O'Neill et al. 2008) emphasize that there is a marked difference between

being sedentary and being physically inactive. While it is often the case that individuals are

physically inactive (do not meet PA recommendations) and engage in large amounts of SB, it is

also fairly common for people to engage in high amounts of PA and SB (Pate, O'Neill et al. 2008;

Troiano, Berrigan et al. 2008), a categorization Owen et al. call the “Active Couch Potatoes”

(Owen, Healy et al. 2010). To better address the problem of SB, the Sedentary Behavior Research

Network recently redefined “sedentary” to indicate time spent in seated or supine behaviors (e.g.,

TV watching, computer use, and driving) that elicit an EE of 1.0-1.5 METs (Ainsworth, Haskell et

al. 2011; SBRN 2012). It is important to assess these behaviors (PA and SB) separately as

evidence is accumulating that each behavior appears to exert independent effects on health.

The classic 1953 study by Morris et al. (Morris, Heady et al. 1953) highlighted the potential

influence of SB on health by recognizing differences in heart disease incidence between London’s

bus drivers and bus conductors. While the bus drivers spent the vast majority of

their workday sitting in their driver seats, the conductors were constantly on their feet,

accumulating little SB but a lot of LPA and some MVPA while walking through the double-decker

bus and going up and down the stairs. Incidence of heart disease was higher in the drivers than the

conductors, providing evidence that having high PA and low SB is associated with lower risk of

developing heart disease. However, despite this initial evidence, follow-up studies focused less on

SB and more on PA. Given that PA is easier to measure (especially with recall as the only

available field method at the time) (Healy, Clark et al. 2011) and is arguably easier to prescribe as

part of a lifestyle intervention, it is not surprising that follow-up research focused on the effects of

PA on health.

The importance of SB as a determinant of health returned to prominence in the last 10-15

years with the recognition that our society is becoming increasingly sedentary (Matthews, Chen et

al. 2008), likely due to technological advances which increase the number of sedentary jobs and

allow for more motorized transportation. Using National Health and Nutrition Examination

Survey (NHANES) data, Matthews et al. (Matthews, Chen et al. 2008) found that adults spend

over 50% of their waking time (7.7 hours/day) engaged in SB (Matthews, Chen et al. 2008), while

Troiano et al. found that average adults spend only 27-29 min/day engaged in MVPA (Troiano, Berrigan

et al. 2008). Together, this information indicates that adults spend an average of more than 10

times as much time in SB as in MVPA. Since SB comprises such a large percentage of the average

person’s day, it is not surprising that SB has been linked to an array of health outcomes, including

obesity (Shields and Tremblay 2008), metabolic and cardiovascular health (Healy, Dunstan et al.

2007), and all-cause mortality (Katzmarzyk, Church et al. 2009).

From an energy balance perspective, SB requires much less energy than LPA or MVPA,

resulting in lower daily EE and increasing risk for weight gain and obesity (Levine, Eberhardt et al.

1999; Hu, Li et al. 2003; Levine, Lanningham-Foster et al. 2005). For example, for a person with a

resting EE of 70 kcal/hour, replacing two hours of SB with two hours of LPA could burn an extra

140 kcal/day (([2.5 METs * 2 hours] - [1.5 METs * 2 hours]) * 70 kcal/MET-hour), which is more than

the 105 kcal required to walk at a moderate intensity for 30 minutes (3 METs * 0.5 hours = 1.5

MET-hours * 70 kcal/MET-hour). In fact, if this person was otherwise in energy balance but

wanted to lose weight, replacing two hours of SB with LPA (and holding all other factors constant)

would result in losing one pound of body weight every 25 days ([3,500 kcal/lb] / [140 kcal/day]),

or nearly 15 pounds in a year. In a laboratory-based study of 20 adults, Swartz et al. (Swartz,

Squires et al. 2011) put this theory into action, measuring EE while having participants complete

four activity protocols. Each protocol lasted for 30 minutes; all four bouts started with SB, and

then the participant either continued to sit or broke their SB with a one-, two-, or five-minute walk

at a self-selected pace. When the results were extrapolated to a standard eight-hour workday,

participants would burn 132 kcal/day more by taking five-minute walking breaks every 30 minutes

(total of 80 minutes of walking) than by sitting for the entire eight hours. In summary, SB can

have important implications for total energy balance and maintenance or attainment of a healthy

body weight.
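
The energy-balance arithmetic in this paragraph can be sketched in a few lines of Python. This is an illustrative calculation only, using the figures quoted above (a resting EE of 70 kcal/hour, 1.5 METs for SB, 2.5 METs for LPA, 3 METs for moderate walking, and 3,500 kcal per pound of body weight); the function and variable names are hypothetical and are not taken from any cited study.

```python
# Illustrative sketch of the energy-balance arithmetic; all names are hypothetical.
RESTING_EE_KCAL_PER_HOUR = 70.0   # energy cost of 1 MET for the example person
KCAL_PER_POUND = 3500.0           # energy equivalent of one pound, as used in the text


def kcal(mets: float, hours: float) -> float:
    """Energy expended (kcal) at a given MET level for a given duration."""
    return mets * hours * RESTING_EE_KCAL_PER_HOUR


# Replacing two hours of SB (1.5 METs) with two hours of LPA (2.5 METs).
extra_kcal_per_day = kcal(2.5, 2.0) - kcal(1.5, 2.0)

# A 30-minute walk at moderate intensity (3 METs).
walk_kcal = kcal(3.0, 0.5)

# Days needed to accumulate a one-pound deficit, all else held constant.
days_per_pound = KCAL_PER_POUND / extra_kcal_per_day

# Walking time from a five-minute break in every 30-minute block of an 8-hour day.
walking_minutes = (8 * 60 // 30) * 5

print(extra_kcal_per_day, walk_kcal, days_per_pound, walking_minutes)
```

Under these assumptions the sketch gives 140 kcal/day of additional expenditure, 105 kcal for the walk, 25 days per pound, and 80 minutes of walking across the workday.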

In addition to SB resulting in lower EE, high amounts of SB and prolonged SB have been

shown to negatively affect metabolic and cardiovascular health in laboratory-based studies.

Studies in mice and rats have introduced forced SB by immobilizing the animals’ hind limbs, and

these studies show that in as little as a few hours, the muscular unloading caused by prolonged SB

can result in reduced insulin sensitivity (Seider, Nicholson et al. 1982; Hamilton, Hamilton et al.

2007), poor glucose transport (Ploug, Ohkuwa et al. 1995), and suppression of muscle lipoprotein

lipase (Bey and Hamilton 2003; Zderic and Hamilton 2006). Additionally, bed rest studies in

humans reveal major negative changes in insulin sensitivity (sometimes reducing sensitivity by

40% or more) (Stuart, Shangraw et al. 1988; Mikines, Richter et al. 1991; Smorawinski, Kubala et

al. 1996; Bergouignan, Rudwill et al. 2011) and high-density lipoprotein cholesterol levels

(Yanagibori, Suzuki et al. 1997; Yanagibori, Kondo et al. 1998), as well as an increased risk of blood clots

(Bird 1972; Kierkegaard, Norgren et al. 1987) within the first day spent in bed. Similarly,

impaired insulin action (Tobin, Uchakin et al. 2002) and blood pressure responses (Hargens and

Richardson 2009) have been observed with spaceflights and simulated microgravity. All three of

these research avenues suggest that SB and a lack of breaks in SB carry

negative health consequences; however, results from animal studies cannot be directly applied to

humans, and bed rest and spaceflight studies represent an extreme situation to which humans are

rarely exposed, limiting their generalizability to typical, free-living SB. Importantly, standing,

which is often considered a sedentary activity, does not fit the definition of a sedentary behavior

because it is not a supine or seated posture, even though it does elicit an energy cost of less than

1.5 METs (Ainsworth, Haskell et al. 2011). Moreover, standing requires significant and prolonged

contraction of major muscle groups in the legs, and this does not fit the proposed mechanism for

many of the negative physiologic effects seen with prolonged sitting or lying (Hamilton, Hamilton

et al. 2004; Hamilton, Hamilton et al. 2007). Therefore, standing likely does not affect health in

the same way as SB and must be assessed as a separate construct when identifying the health risks

of SB.

To support the evidence from these laboratory-based and spaceflight studies, data from

several large epidemiologic studies have been used to assess links between SB and health

outcomes, both longitudinally and cross-sectionally. Cross-sectional studies have added

considerably to our knowledge of SB and its relationship to health. Healy and colleagues have

published several studies assessing the associations between SB and cardiometabolic health

(Healy, Dunstan et al. 2007; Healy, Dunstan et al. 2008; Healy, Dunstan et al. 2008; Healy,

Wijndaele et al. 2008; Wijndaele, Healy et al. 2010; Healy, Matthews et al. 2011). Using

accelerometer-derived SB (≤100 counts/min using the ActiGraph accelerometer), they found that

US adults in the highest quartile of SB had several adverse cardiometabolic biomarkers, including

32% higher insulin and 12% higher C-reactive protein levels, 5% lower high-density

lipoprotein, and a 1.6 cm larger average waist circumference when compared to adults in the

lowest SB quartile (NHANES data) (Healy, Matthews et al. 2011). Similarly, in a subsample of

participants enrolled in the Australian Diabetes, Obesity and Lifestyle Study, a 30-min decrease

in SB was associated with a 7% lower waist circumference, and a similar drop in clustered

metabolic risk score (Healy, Dunstan et al. 2007; Healy, Wijndaele et al. 2008). Additionally, in

several different samples, Healy et al. have found that adults in the highest quartile for rates of

breaking up SB with short periods of non-sedentary activity tend to have better metabolic health as

well as a 5% lower waist circumference than adults in the lowest quartile (Healy, Dunstan et al.

2008). Healy et al. also found an inverse dose-response relationship of SB breaks with BMI and

plasma glucose, independent of total PA or SB (Healy, Dunstan et al. 2007; Healy, Dunstan et al.

2008; Healy, Matthews et al. 2011). These cross-sectional studies provide strong evidence of

associations between SB and several health indices, but from these alone we cannot establish cause

and effect.

Longitudinal evidence has also shown some support for the link between SB and many

health conditions, although the evidence is less conclusive than in cross-sectional work. A 2011

review by Thorp et al. (Thorp, Owen et al. 2011) provides good insight into the state of the

longitudinal evidence concerning the link between SB and health outcomes. Of the 48 studies

included, 45 used self-report measures (TV watching and/or total sitting time), one used HR, one

used both HR and self-report, and only one used accelerometry to measure PA and SB. Thorp’s

review showed consistent evidence of an association between high levels of SB and risk of

cardiovascular disease, all-cause mortality, and obesity. In many of the studies, the authors

statistically controlled for BMI and time spent in MVPA, but few accounted for variables such as

education or socioeconomic status. In two studies included in the review, those in the highest SB

category had 54-130% increased risk of cardiovascular disease and 52-54% increased risk of all-

cause mortality in 4- and 12-year follow-ups (Katzmarzyk, Church et al. 2009; Stamatakis, Hamer

et al. 2011). Similarly, two other studies (6.6- and 10-year follow-ups) showed a dose-response relationship,

with each hour of extra television watched per day increasing risk of cardiovascular disease by 7-

18% and all-cause mortality by 4-11%. In relation to obesity risk, several studies showed that

high SB in childhood was related to a 22-42% increased risk of obesity in early adulthood (Boone,

Gordon-Larsen et al. 2007; Erik Landhuis, Poulton et al. 2008).

Thorp’s review also shows some evidence of an association between SB and risks of

developing diabetes and certain types of cancer. For example, two studies found dose-response

relationships between SB and risk of developing diabetes, with the highest SB group having a 61-

187% increased risk of developing diabetes in 8-10 year follow-ups (Hu, Leitzmann et al. 2001;

Ford, Schulze et al. 2010); in another study, each two-hour increase in SB was associated with a 7-

14% increase in diabetes risk during a 6-year follow-up (Hu, Li et al. 2003). However, in the two

studies using objectively measured SB (HR and accelerometry), there were conflicting results

regarding the relationship between SB and insulin resistance (Ekelund, Brage et al. 2009;

Helmerhorst, Wijndaele et al. 2009), and some of these studies also found that controlling for other

factors such as PA moderated the associations. Similarly, in relation to cancer risk, two studies

(with 9- and 10-year follow-ups) found that those with high SB had a 55% increased risk of

developing ovarian cancer in females and a 61% increased risk of developing colon cancer in

males (but not females) (Patel, Rodriguez et al. 2006; Howard, Freedman et al. 2008), although

findings from other studies and other types of cancer have been mixed (Howard, Freedman et al.

2008; Gierach, Chang et al. 2009). These findings are intriguing but far from conclusive,

warranting more research examining SB in relation to these outcome variables.

Moreover, in eight of the studies, PA appeared to mediate the effects of SB on health

outcomes, casting some doubt on the robustness of SB as a risk factor independent of PA. In one

such study, Katzmarzyk examined self-reported time spent standing and mortality and found an

inverse dose-response relationship between standing time and both mortality and cardiovascular

disease, but only among those with low PA levels (Katzmarzyk 2014). Yet, the considerable

variation in self-report instruments used and the paucity of research using objective measures of

PA or SB severely limit our understanding of the true risk that SB poses to health or what levels of SB

are appropriate for maintaining or enhancing health. Previous evidence indicates that

accelerometers yield higher-quality data and stronger associations with health outcomes than self-

report (Reilly, Penpraze et al. 2008; Celis-Morales, Perez-Bravo et al. 2012), and a recent

review by Atkin et al. (Atkin, Gorely et al. 2012) found that most self-report measures of

SB have poor validity. Therefore, it is likely that objectively measured PA and SB will yield

stronger and more consistent associations of SB with health and greatly enhance our understanding

of the ways in which these behaviors influence health.

In conclusion, experimental studies have shown that prolonged SB has negative effects on

metabolic variables that contribute to long-term disease risk. Also, there is evidence from cross-

sectional and longitudinal studies showing that TV watching and overall SB have a strong and

consistent association with risk of several chronic diseases, although results were based on poor

measures of PA and SB. Therefore, to determine the specific effects and true risk of SB

on health, discover patterns of PA and SB associated with increased disease risk, and develop

national recommendations for SB to improve health, methods for objective measurement of SB

need to be utilized and refined for use in observational and intervention research.

The next section of this literature review focuses on the progression of methods that have

been used for measurement of PA and SB, limitations of the current methods, and gaps in the

literature that are addressed with the current study.

Accelerometry as a preferred method to measure physical activity, energy expenditure,

sedentary behavior, and activity type

Measurement methods

Many methods have been developed and used for measuring EE, PA, and SB. For

smaller, laboratory-based studies, methods such as direct or indirect calorimetry can be used to

obtain very accurate measurements of EE, and direct observation (DO) can be used to accurately

record the time and type of PA being performed. However, calorimetry and DO are impractical

for use in public health, surveillance, and epidemiologic research because these types of studies

involve measurement of a large number of participants outside of the controlled laboratory

environment.

In large-scale studies, self-report measures such as questionnaires, diaries, and interviews

are often used to measure EE, PA, and SB. Self-report measures are relatively inexpensive, can

yield estimates of EE, and can provide information about the timing, frequency, and types of PA

and SB performed (Sallis and Saelens 2000). However, self-report is vulnerable to recall bias

and substantial reporting error (LaPorte, Montoye et al. 1985; Sallis and Saelens 2000; Shephard

2003; van Poppel, Chinapaw et al. 2010). Measurement errors associated with self-report reduce

or attenuate associations between PA or SB and disease (Frost and White 2005; Lagerros and

Lagiou 2007); as a result, statistical power decreases when trying to detect significant

relationships between self-reported measures of EE, PA, or SB and health outcomes, and the risk

of type II error increases (Beaton, Milner et al. 1979; MacMahon, Peto et al. 1990).

Additionally, measurement error reduces researchers’ ability to obtain valid measurements of

EE, PA, and SB and hinders efforts to detect meaningful changes in these variables that may

occur as the result of lifestyle interventions (Dale, Welk et al. 2002; Healy, Clark et al. 2011;

Matthews, Moore et al. 2012).

Self-report has been used to measure PA and SB with varying levels of success.

Generally, both PA and SB can be measured with only low-to-moderate validity (van Poppel,

Chinapaw et al. 2010; Healy, Clark et al. 2011; Lyden 2012), although MVPA can be assessed

with higher validity than SB (Matthews, Moore et al. 2012). It is not surprising that self-report is

more successful for measuring PA (especially MVPA) than SB. In adults, MVPA usually occurs

during structured or planned activities and can be recalled with better accuracy than SB, which is

typically more intermittent in nature and is, therefore, more difficult to recall (Healy, Clark et al.

2011). In addition, few self-report tools are properly designed for measurement of SB. SB has

traditionally been assessed using proxy measures such as time spent watching TV, driving, using

a computer, work-based sitting time, and/or total screen time. In a recent review of the literature,

Healy et al. (Healy, Clark et al. 2011) found that most studies support the conclusion that specific sedentary

activities can be recalled with acceptable reliability and validity (intraclass correlation > 0.50 and

Pearson/Spearman correlations >0.40). However, self-report of total SB generally has lower

validity (Pearson/Spearman correlations < 0.40) when compared to accelerometer-derived SB in

adults (Hagstromer, Oja et al. 2006; Healy, Clark et al. 2011).

Similarly, it appears that breaks in SB cannot be accurately assessed using self-report. In

2011, Clark et al. (Clark, Thorp et al. 2011) found that 121 adult office workers recalled total SB

with moderate validity (r=0.39) but had poor validity for recalling breaks in SB (r=0.26) during

the work day. Moreover, most self-report measures contain few questions about sedentary

activities or total SB and no questions about breaks in SB, making measurement of SB

impossible using many current self-report tools (Healy, Clark et al. 2011).

Limitations of self-report methods have led researchers to use pedometers, heart rate

(HR) monitors, and accelerometers for objective measurement of EE, PA, and SB. Of these

methods, pedometers can only measure steps taken and, therefore, provide no information on SB,

activity intensity, or activity duration (Tudor-Locke and Myers 2001). HR monitors provide a

good estimation of moderate-to-vigorous PA (MVPA), but optimal accuracy is dependent on

developing individualized curves that match HR to EE values, which can vary considerably

among people of different ages and cardiorespiratory fitness levels (Janz 2002). Additionally,

HR monitors have limited utility for measuring light-intensity PA or SB because lower-intensity

activities tend to elicit high HR variability (Spurr, Prentice et al. 1988). Furthermore, HR is

influenced by a number of external factors such as stress, caffeine intake, and temperature, which

affect HR during SB and LPA much more than MVPA (Montoye, Kemper et al. 1996; Crouter,

Albright et al. 2004). Finally, HR monitors can be cumbersome to wear, which may lower

compliance rates compared to accelerometers or pedometers (Janz 2002; Andre and Wolf 2007).

Accelerometers have become the preferred device for measuring EE, PA, and SB due to

their objectivity, minimal participant and researcher burden, and ability to measure free-living

activity for several weeks at a time. Accelerometers work by recording accelerations of a single

part of the body and using this information to predict EE or activity type. Traditionally, these

accelerations were passed through a filter to remove aberrant signals and then translated into

‘activity counts’ corresponding to the magnitude of the acceleration.
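
As a rough illustration of the idea behind an activity count, the sketch below removes the static component of a raw acceleration trace with a crude mean-subtraction step, rectifies the remainder, and sums it over fixed-length epochs. This is a toy under stated assumptions: commercial monitors apply their own proprietary filters and scaling, and the function name, units, and epoch length used here are hypothetical.

```python
from typing import List


def activity_counts(accel_g: List[float], samples_per_epoch: int) -> List[float]:
    """Toy 'counts': summed absolute deviation from the epoch mean, per epoch."""
    if samples_per_epoch <= 0:
        raise ValueError("samples_per_epoch must be positive")
    counts = []
    # Only complete epochs are scored; a trailing partial epoch is dropped.
    for start in range(0, len(accel_g) - samples_per_epoch + 1, samples_per_epoch):
        epoch = accel_g[start:start + samples_per_epoch]
        baseline = sum(epoch) / len(epoch)  # crude estimate of the static component
        counts.append(sum(abs(a - baseline) for a in epoch))
    return counts


# Hypothetical signal: one motionless epoch followed by one epoch with movement.
still = [1.0] * 4
moving = [0.5, 1.5, 0.5, 1.5]
print(activity_counts(still + moving, samples_per_epoch=4))
```

In this toy signal the motionless epoch scores zero and the epoch with movement scores higher, which is the property that count-based methods rely on.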

In most studies, accelerometers have been worn on the hip to record vertical accelerations

of the trunk; these vertical accelerations were found to correlate well with EE for ambulatory

activities, such as walking and running (Montoye, Washburn et al. 1983; Freedson, Melanson et

al. 1998). However, hip-mounted accelerometers have limited accuracy for measuring EE for

SB and many lifestyle activities (e.g., household chores, gardening, climbing/descending stairs,

and cycling) and, when using common linear-regression or cut-point methods, cannot classify

activity type (Hendelman, Miller et al. 2000; Crouter, Churilla et al. 2006; Rothney, Schaefer et

al. 2008; Lyden, Kozey et al. 2011). Recently, accelerometer battery and memory capacity have

improved to allow measurement of three-dimensional, raw acceleration data for as long as

several months at a time (Westerterp 1999). Following these technological improvements, data

processing methods such as machine learning have emerged as superior methods for analyzing

accelerometer data.

The following sections will review the progression of accelerometer data processing

techniques leading up to the present time, address current limitations in data processing, and

discuss how the current study will improve EE, PA, and SB measurement.

The Large-Scale Integrated monitor and Caltrac

In 1979, Laporte et al. (LaPorte, Kuller et al. 1979) developed the Large-Scale Integrated

(LSI) motor activity monitor for the measurement of EE. The LSI was slightly larger than a

wristwatch and contained a ball of mercury housed in a small cylinder. When the LSI was

moved, the mercury would roll down the cylinder and run into a mercury switch. The number of

times the switch was contacted was displayed on a small screen. In this way, the LSI functioned

like a pedometer but was intended for use on the hip and other parts of the body (i.e., ankle,

wrist). To assess the LSI’s ability to measure EE, Laporte et al. designed a series of experiments

where they had participants log their activity for two days while wearing the LSI on the hip and

ankle. Activities in the activity logs were looked up in previously developed EE tables (that

reported average EE required for each activity) to obtain a measure of total EE, and these EE

values were correlated to the output from the LSI monitors. While both the hip- and ankle-

mounted monitors had positive correlations with EE, the hip monitor performed significantly

better (r=0.69) than the ankle monitor (r=0.43). This study provided a first step in validating

activity monitors, but it used a poor criterion measure by estimating EE from tables instead of

directly measuring EE and did not compare this monitor to other activity measures in use at the

time (e.g., pedometer).

In 1981, Wong et al. (Wong, Webster et al. 1981) developed an accelerometer that was

later commercially produced as the Caltrac (Hemokinetics, Inc., Madison, WI). The Caltrac had

a piezoelectric sensor which recorded accelerations based on the output charge generated with a

movement, with faster accelerations producing a greater charge. This method provided a

significant advantage over the pedometer, which could only record steps and could not

distinguish between different speeds of movement. The Caltrac was worn on the hip or lower back and

recorded total vertical accelerations accrued, allowing it to measure total EE over the time period

it was worn. In two different laboratory experiments, Montoye et al. showed that the Caltrac had

higher correlations with measured EE than other activity monitors. In the first experiment

(Wong, Webster et al. 1981), 15 participants performed walking (at 2, 3, and 4 mph), running (at

6 and 8 mph), and stepping (80, 120, and 160 steps/min) for three minutes each. During the

testing the participants wore the Caltrac, two different pedometers, and a metabolic analyzer;

the authors found that the Caltrac had significantly higher correlations with measured EE than either of

the pedometers (data displayed in a figure, but no exact correlation coefficients given). Next,

they conducted a second study (Montoye, Washburn et al. 1983) where 21 adults performed level

and inclined walking (2 and 4 mph at 0, 6, and 12% grades), level and inclined running (6 mph at

0 and 6% grades), stepping (20 and 35 steps/min), knee-bends (28 and 48 bends/min), and floor

touches (24 and 36 touches/min) for four minutes each. During the activities, participants wore

the Caltrac on the hip, two LSI activity monitors (worn on hip and wrist), and a metabolic

analyzer. Similar to the previous study, the Caltrac had significantly higher correlations (r=0.79

vs. r=0.71 and r=0.40) and lower standard errors (S=6.63 vs. S=7.86 and S=9.16 ml/kg/min) for

EE measurement than the hip- and wrist-worn LSI monitors.

These studies provided the first evidence of the utility of accelerometers for EE

measurement over pedometers or other kinds of activity monitors. Additionally, the comparison

of the hip-mounted LSI to the wrist- and ankle-mounted LSIs provided preliminary evidence that

the hip placement for activity monitors was preferable to limb placement when EE was the

outcome variable of interest. However, these early monitors were restricted to measuring

total EE and could not yield information about activity type, duration, or intensity.

Linear regression

For almost 15 years, the Caltrac was the most commonly used accelerometer for EE

measurement in both adults and children (Sallis, Buono et al. 1990; Haymes and Byrnes 1993).

Then, in the mid-1990s, newer accelerometers such as the Tritrac and the Computer Science

Applications (CSA, also called the ActiGraph 5032) were developed, tested, and validated for

measuring EE in a number of different studies (Janz, Witt et al. 1995; Melanson and Freedson

1995; Welk and Corbin 1995).

In 1998, accelerometer data processing took a large step forward with a study by

Freedson et al. (Freedson, Melanson et al. 1998) which was the first to use accelerometer data to

measure PA intensity as well as EE using the uniaxial CSA 7164 accelerometer (a newer version

of the CSA 5032). Their study was a laboratory-based validation study where 50 adult

participants walked (3.0 and 4.0 mph) and jogged (6.0 mph) on a treadmill, for six minutes each,

while wearing a metabolic analyzer and a CSA on the right hip. Accelerometer counts and EE

data were collected at one-minute intervals, allowing minute-by-minute comparisons of counts

and EE. Accelerometer counts were found to have high correlations (r=0.88) with EE, allowing

a linear regression model to be created to predict EE (in METs) from accelerometer counts.

Furthermore, activity intensity could be derived from METs by establishing count thresholds

(cut-points) to classify PA into light (<3.0 METs, <1952 counts/min), moderate (3.0-5.9 METs,

1952-5724 counts/min), hard (6.0-8.9 METs, 5725-9498 counts/min), and very hard (≥9.0

METs, ≥9499 counts/min) intensities.
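
The cut-point approach described above amounts to a simple lookup on counts per minute. The sketch below applies the four thresholds quoted in this paragraph; the function name is hypothetical, and the code illustrates the classification rule rather than reproducing any software used by Freedson et al.

```python
def freedson_intensity(counts_per_min: float) -> str:
    """Classify one minute of hip-worn CSA counts using the cut-points quoted above."""
    if counts_per_min < 0:
        raise ValueError("counts cannot be negative")
    if counts_per_min < 1952:
        return "light"        # <3.0 METs
    if counts_per_min <= 5724:
        return "moderate"     # 3.0-5.9 METs
    if counts_per_min <= 9498:
        return "hard"         # 6.0-8.9 METs
    return "very hard"        # >=9.0 METs


# Hypothetical minute-by-minute counts, chosen only to exercise each category.
for counts in (150, 1952, 6000, 9499):
    print(counts, freedson_intensity(counts))
```

Note that this rule has no separate sedentary category, so every minute below 1952 counts/min, including sitting or lying, is labeled light.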

Since Freedson et al. published their study, the development of cut-points to classify

activity intensity has been the preeminent method for validating and using accelerometers. Cut-

point development is relatively simple for researchers to accomplish and understand, and the cut-

point approach seems to work relatively well for measurement of ambulatory activities

(Freedson, Melanson et al. 1998; Lyden, Kozey et al. 2011). However, a significant limitation of

a linear regression equation developed using ambulatory activities is that it does not predict EE

well when non-ambulatory activities are performed. Hendelman et al. (Hendelman, Miller et al.

2000) designed a free-living simulation that involved four self-selected speeds of walking

(ambulatory activities), household chores (washing windows, dusting, vacuuming, lawn mowing,

and planting shrubs), and two holes of golf. During the session, participants wore the CSA on

the hip and had EE measured with a portable metabolic analyzer. Two regression equations were

then developed, one for the walking activities (the “calibration” regression equation) and one for

all activities. Similar to Freedson’s equation, Hendelman’s calibration regression equation

performed significantly better for predicting EE during the walking activities (r=0.77) than for all

activities (r=0.59), and underestimated EE by 30.5-56.8% in the free-living simulation.

From Hendelman’s study, it is apparent that linear regression models developed using

ambulatory activities perform much better when measuring EE during ambulatory activities than

for non-ambulatory activities. To support this finding, the regression equation and cut-points

developed by Freedson et al. (Freedson, Melanson et al. 1998) have been studied by Crouter et

al. (Crouter, Churilla et al. 2006), Lyden et al. (Lyden, Kozey et al. 2011), and Rothney et al.

(Rothney, Schaefer et al. 2008); these studies support Hendelman’s conclusion that Freedson’s

regression equation significantly underestimates EE when applied to non-ambulatory activities.

In order to improve on the shortcomings of EE regression equations developed using only

ambulatory activities, researchers began developing equations using both ambulatory and non-

ambulatory activities. In 2000, Swartz et al. (Swartz, Strath et al. 2000) developed a linear

regression equation using 28 activities, of which only two were ambulatory activities (walking at

2.9 and 3.7 mph) and the remaining 26 were non-ambulatory, lifestyle activities (e.g., sports such

as tennis and softball, household chores such as cooking and laundry). Their regression equation

yielded only moderate validity (r=0.56) for EE measurement, but the studies by Crouter et al.

(Crouter, Churilla et al. 2006), Lyden et al. (Lyden, Kozey et al. 2011), and Rothney et al.

(Rothney, Schaefer et al. 2008) confirmed that Swartz’s regression model had better validity for

measuring MVPA in free-living settings than Freedson’s.

One of the major differences between the cut-points developed by Freedson and those

developed by Swartz is that to overcome the underestimation of lifestyle activities, Swartz had a

much lower cut-point for MVPA than Freedson (574 counts/min from Swartz’s equation vs.

1952 counts/min from Freedson’s equation). However, because of the lower MVPA cut-point,

the regression line had a much flatter slope so that the EE of more intense activities would not be

overestimated. Thus, Swartz’s equation had a y-intercept at 2.606, meaning that the predicted

MET value for an activity registering 0 counts/min was 2.606 METs. This is a substantial error

given that activities likely to record 0 counts/min (such as lying or sitting) elicit EE values of 1.0

METs (Ainsworth, Haskell et al. 2011). Thus, Swartz et al. improved the measurement of

MVPA at the expense of measuring SB and LPA.
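
The trade-off described above can be made concrete with the two figures quoted in the text: a y-intercept of 2.606 METs and an MVPA cut-point of 574 counts/min. Assuming the equation is a straight line and that the MVPA threshold corresponds to 3.0 METs, the slope can be back-calculated as in the sketch below. The slope here is implied by those two figures, not quoted from Swartz et al., and all names are hypothetical.

```python
INTERCEPT_METS = 2.606       # predicted METs at 0 counts/min (from the text)
MVPA_CUTPOINT = 574.0        # counts/min at the MVPA threshold (from the text)
MVPA_THRESHOLD_METS = 3.0    # assumed MET value marking moderate intensity
SEDENTARY_METS = 1.0         # EE of lying or sitting (from the text)

# Slope of the straight line through (0, 2.606) and (574, 3.0).
implied_slope = (MVPA_THRESHOLD_METS - INTERCEPT_METS) / MVPA_CUTPOINT


def predicted_mets(counts_per_min: float) -> float:
    """EE predicted by the back-calculated line (an approximation, see above)."""
    return INTERCEPT_METS + implied_slope * counts_per_min


# Overestimate for a minute that registers no counts at all.
sedentary_error = predicted_mets(0.0) - SEDENTARY_METS
print(round(implied_slope, 6), round(sedentary_error, 3))
```

With a line this flat, a minute registering 0 counts/min is still assigned about 1.6 METs more than the 1.0 MET it actually requires, which is the SB and LPA error noted above.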

In summary, the available evidence suggests that no simple linear regression model

successfully classifies all activity intensities or accurately predicts EE across a variety of

activities. As evidenced by Swartz’s study, improving measurement of certain PA intensities degrades

measurement of the other intensities. Although the linear regression model is simple to

understand and use, more complicated methods of accelerometer data processing are necessary

to improve measurement of EE and activity intensity and classify activity type.

Multiple regression

Once it became apparent that single linear regression models could not adequately

measure EE or activity intensity across a range of activity types, researchers moved toward

creating more complex, multiple regression equations in an attempt to improve activity

measurement. Heil (Heil 2006) was the first to experiment with a model where EE was

predicted using one of two independent, linear regression models that had different slopes.

Model 1 was used for activities eliciting 350-1,200 counts/min, and model 2 was used for

activities eliciting >1,200 counts/min. Model 1 had a steeper slope than model 2, and this steeper

slope helped predict EE of non-ambulatory activities (which were often underestimated by single

regression equations) more accurately while reducing the overestimation of light-intensity

activities. By utilizing this approach, Heil was able to significantly improve estimates of EE

over single regression models (r=0.84 for single regression model, r=0.87 for model 1 of two-

regression model, and r=0.92 for model 2 of the two-regression model) when predicting EE

across 10 activities (7 lifestyle, 3 ambulatory). Additionally, Heil was among the first to set a

threshold for SB; in his model, any activity eliciting <350 counts/min was assigned an EE value

of 1.0 MET instead of being input into a prediction equation. This sedentary threshold was

implemented to alleviate the significant overestimation of EE present in single regression

models.
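
The structure of the two-regression approach with a sedentary threshold can be sketched as a piecewise function. The count boundaries (350 and 1,200 counts/min) and the 1.0 MET sedentary value come from the description above; the intercepts and slopes are placeholders chosen only so that model 1 is steeper than model 2. They are not the published coefficients, and all names are hypothetical.

```python
SEDENTARY_THRESHOLD = 350    # counts/min, below which EE is fixed (from the text)
MODEL_SWITCH = 1200          # counts/min, upper bound of model 1 (from the text)
SEDENTARY_METS = 1.0         # EE assigned to sedentary minutes (from the text)

# Placeholder (intercept, slope) pairs: NOT the published coefficients.
MODEL_1 = (1.5, 0.0020)      # steeper slope, used for 350-1,200 counts/min
MODEL_2 = (3.0, 0.0008)      # flatter slope, used for >1,200 counts/min


def two_regression_mets(counts_per_min: float) -> float:
    """Piecewise EE estimate: fixed sedentary value, then one of two linear models."""
    if counts_per_min < SEDENTARY_THRESHOLD:
        return SEDENTARY_METS
    intercept, slope = MODEL_1 if counts_per_min <= MODEL_SWITCH else MODEL_2
    return intercept + slope * counts_per_min


for counts in (0, 349, 350, 1200, 2000):
    print(counts, round(two_regression_mets(counts), 2))
```

The key design point is that sedentary minutes bypass the regression entirely, so the intercept of either line no longer inflates the EE estimate for sitting or lying.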

Despite the improvements in EE measurement seen with Heil’s two-regression model, the

model had limited use because it was fully dependent on accelerometer counts and cut-points to

estimate EE. Sole reliance on counts and cut-points for determination of EE is a problem

because some activities with different EE requirements yield similar numbers of counts when

measured with a hip-mounted accelerometer. For example, Lyden et al. (Lyden, Kozey et al.

2011) found that activities can elicit very different counts/min (e.g., 3,245 counts/min for

descending stairs vs. 203 counts/min for raking) while having very similar EE requirements (5.0

METs for descending stairs vs. 5.2 METs for raking). Thus, using Freedson’s regression

equation (Freedson, Melanson et al. 1998), descending stairs would be correctly classified as

moderate-intensity PA, while raking would be incorrectly classified as SB or light-intensity PA.

Similarly, Hendelman et al. (Hendelman, Miller et al. 2000) found that some activities can elicit

similar counts/min (e.g., 1,982 counts/min for walking 2.0 mph vs. 2,144 counts/min for golfing)

but elicit very different EE requirements (2.0 METs for walking 2.0 mph vs. 4.3 METs for

golfing). In this example, walking and golfing would both be classified as moderate-intensity

PA based on counts, even though walking at 2.0 mph is actually LPA. As one final example,

both lying and standing quietly elicit close to 0 counts/min, so EE prediction would be the same

for each (1.0 METs). However, lying requires an EE of 1.0 METs, while standing requires 1.3

METs (Ainsworth, Haskell et al. 2011); while the absolute difference is only 0.3 METs,

misclassifying standing as lying is a 30% error in EE. This error is especially significant

considering that adults in the US spend about 60% of their day engaged in SB (Matthews, Chen

et al. 2008), highlighting the need to be able to detect small differences in EE that exist among

different types of SB and LPA.
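
The misclassifications described above can be reproduced with a short sketch that compares the intensity category implied by counts/min with the category implied by measured METs. The counts and MET values are those quoted above from Lyden et al. and Hendelman et al. The count cut-points (1,952 and 5,725 counts/min) are the commonly cited Freedson adult cut-points and are an assumption here, since they are not stated in this section; the function names are hypothetical.

```python
# Commonly cited Freedson et al. (1998) adult cut-points, in counts/min.
# ASSUMPTION: these values are not stated in the text above.
MODERATE_COUNTS = 1952
VIGOROUS_COUNTS = 5725


def intensity_from_counts(counts_per_min):
    """Intensity category implied by accelerometer counts/min."""
    if counts_per_min < MODERATE_COUNTS:
        return "light or below"
    if counts_per_min < VIGOROUS_COUNTS:
        return "moderate"
    return "vigorous"


def intensity_from_mets(mets):
    """Intensity category implied by measured EE (3 and 6 MET boundaries)."""
    if mets < 3.0:
        return "light or below"
    if mets < 6.0:
        return "moderate"
    return "vigorous"


# (activity, counts/min, measured METs) as quoted in the text above.
ACTIVITIES = [
    ("descending stairs", 3245, 5.0),
    ("raking", 203, 5.2),
    ("walking 2.0 mph", 1982, 2.0),
    ("golfing", 2144, 4.3),
]


def misclassified(activities):
    """Names of activities whose count-based category disagrees with METs."""
    return [
        name
        for name, counts, mets in activities
        if intensity_from_counts(counts) != intensity_from_mets(mets)
    ]
```

Under these assumptions, raking and walking at 2.0 mph are the two activities placed in the wrong category by counts alone, in line with the examples given above.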

In contrast to Heil’s method of choosing the regression line based on counts/min, Crouter

et al. (Crouter, Clowers et al. 2006) developed a two regression model where the regression line

used was dependent on the variability of the activity being performed. They discovered that

variability in counts for ambulatory activities is lower than the variability of non-ambulatory,

lifestyle activities (which tend to be more intermittent in nature). In order to determine

variability, they parsed the one-minute data into six 10-second segments and calculated the

coefficient of variation (CV) for the minute. Then they developed a two regression model where

activities with a CV of ≤10 were analyzed using an exponential regression curve developed for

ambulatory activities, and those with a CV of >10 were analyzed with a cubic regression curve

developed for non-ambulatory activities. They chose exponential and cubic curves because these

fit their data better than linear regression lines. When Crouter et al. tested their model in 48

participants performing 17 activities (4 ambulatory, 13 lifestyle), their model showed greatly

improved accuracy for measuring METs (r=0.96) compared to Freedson’s, Hendelman’s, and

Swartz’s (Swartz, Strath et al. 2000) linear regression models, where the highest correlation was

r=0.70. A subsequent study by Crouter et al. (Crouter and Bassett 2008) produced a two

regression model for the Actical accelerometer, and they found similar improvements in EE

compared to single, linear regression.
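
The variability step of a CV-based two-regression model can be sketched as follows. The six 10-second segments and the CV threshold of 10 come from the description above. The use of the population standard deviation, the handling of a minute with zero counts, and the function names are assumptions made for illustration, and the exponential and cubic equations themselves are not reproduced because their coefficients are not given here.

```python
import statistics

CV_THRESHOLD = 10.0  # percent, from the text above


def coefficient_of_variation(segment_counts):
    """CV (%) of the 10-second counts within one minute.

    ASSUMPTION: the population standard deviation is used, and a minute
    with a mean of zero counts has no defined CV (returns None).
    """
    mean = statistics.mean(segment_counts)
    if mean == 0:
        return None
    return 100.0 * statistics.pstdev(segment_counts) / mean


def choose_regression(segment_counts):
    """Pick which curve a minute of data would be sent to."""
    if len(segment_counts) != 6:
        raise ValueError("expected six 10-second segments per minute")
    cv = coefficient_of_variation(segment_counts)
    if cv is None:
        return "undefined"
    # Low variability suggests steady ambulation; high variability suggests
    # intermittent lifestyle activity.
    return "ambulatory" if cv <= CV_THRESHOLD else "lifestyle"
```

For example, a steady minute such as [500, 510, 495, 505, 498, 502] has a CV of about 1% and would be routed to the ambulatory curve, whereas an intermittent minute such as [0, 300, 50, 600, 20, 400] would be routed to the lifestyle curve.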

Crouter’s model was a significant innovation for two reasons: 1) it was the first to use

characteristics of accelerometer output (CV) other than counts for EE measurement and 2) it was

the first to utilize a non-linear regression model, which is less restrictive than a linear regression

line since it allows more freedom in fitting a relationship between EE and accelerometer output

across both ambulatory and lifestyle activities. However, subsequent evidence suggests that

Crouter’s model may suffer from over-fitting, where the extra freedom of the non-linear model

allowed for construction of a more accurate model for a specific population (better internal

validity) but a less accurate model when applied to other populations, data sets, or sets of

activities (poorer generalizability). To demonstrate this point, Lyden et al. (Lyden, Kozey et al.

2011) tested Crouter’s model against the models of Freedson, Hendelman, and Swartz in a large

(n=277), independent sample performing 23 activities (6 ambulatory, 17 lifestyle). While

Crouter’s model performed best for lifestyle activities, both Freedson’s and Swartz’s linear

models performed better for ambulatory activities. Therefore, while Crouter’s models did not

solve the problem of accurately measuring both ambulatory and lifestyle activities, they showed

the utility of using accelerometer features other than counts/min to distinguish different kinds of

activities and improve measurement of EE.

Measurement of sedentary behavior using accelerometers

As mentioned previously, self-report has been largely inadequate for measuring SB,

leading researchers to use objective measures for SB measurement. Accelerometers are

seemingly ideal for the measurement of SB because they can capture both movement and non-

movement and, therefore, should be able to measure total SB as well as detect breaks in SB.

Using NHANES data to determine population levels of SB, Matthews et al. used a count cut-

point of <100 counts/min to determine SB and found that the average adult spends about 7.7

hrs/day in SB (Matthews, Chen et al. 2008). Since its first use, the 100 counts/min cut-point has

been used by Healy et al. (Healy, Dunstan et al. 2007; Healy, Wijndaele et al. 2008; Healy,

Matthews et al. 2011), who have found consistent associations between objectively-measured SB

and poor metabolic health. However, the 100 counts/min cut-point for SB was chosen for its

utility in detecting non-movement and has not been validated for use as an accurate measure of SB

(Pate, O'Neill et al. 2008). Additionally, standing quietly elicits less than 100 counts/min, so the

cut-point approach incorrectly classifies standing as SB when, as discussed previously, it exerts

different effects on health and must be distinguished from SB.

To directly test SB cut-points, a 2011 study by Kozey-Keadle et al. (Kozey-Keadle,

Libertine et al. 2011) had 20 adult office workers wear ActiGraph accelerometers for six hours on

two work days, with the second day spent performing more PA and less SB. In this study, DO was

used as the criterion measure of SB. Using the 100 counts/min cut-point, the ActiGraph

underestimated time in SB by 4.9% and could not detect the change in SB that occurred between

the first and second days. Notably, 150 counts/min was identified as producing a more accurate

estimation of total SB (underestimated total SB by 1.8%), although it also could not detect the

change in SB. In a follow-up study, Lyden et al. tested the utility of the 100 counts/min and 150

counts/min SB cut-points by having 13 adults wear ActiGraph accelerometers for 10 hours on two

separate days, while performing less SB and more breaks in SB on the second day. The

investigators found that both the 100 and 150 counts/min cut-points led to overestimations of SB,

overestimations of breaks in SB, and inadequate detection of the reduction in SB on the second

day. Additionally, while the 100 counts/min cut-point was more accurate for total SB, the 150

counts/min cut-point was more accurate for breaks in SB. Together, these two studies indicate

three issues: 1) defining a cut-point for SB is problematic since there does not appear to be a single

one that can accurately measure total SB or breaks in SB, 2) no cut-point has been shown to

accurately detect changes in SB, and 3) a consistent over- or under-estimation may be correctable,

but the previous studies show no such consistent pattern (one showing underestimation and one

showing overestimation). Therefore, measuring SB using the cut-point approach is not sufficient

to capture the complex nature of SB, and the cut-point approach also cannot be used to identify

specific activity types or distinguish standing from SB.
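
The sensitivity of SB estimates to the choice of cut-point can be illustrated with a minimal sketch. The 100 and 150 counts/min cut-points come from the studies above. The definition of a break as a transition from a sedentary minute to a non-sedentary minute is an assumption made for illustration, and the counts in the usage note are invented, not study data.

```python
def sedentary_summary(minute_counts, cutpoint):
    """Return (sedentary minutes, breaks in SB) for one cut-point.

    ASSUMPTION: a break is counted each time a sedentary minute is
    followed by a non-sedentary minute.
    """
    is_sedentary = [counts < cutpoint for counts in minute_counts]
    sedentary_minutes = sum(is_sedentary)
    breaks = sum(
        1
        for previous, current in zip(is_sedentary, is_sedentary[1:])
        if previous and not current
    )
    return sedentary_minutes, breaks
```

For the invented series [20, 80, 120, 40, 0, 140, 300, 90, 60, 110], the 100 counts/min cut-point gives 6 sedentary minutes and 3 breaks, while the 150 counts/min cut-point gives 9 sedentary minutes and 1 break, showing how a single record can yield different totals and different numbers of breaks.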

Machine learning

Regression techniques and cut-points were a logical first step in EE prediction due to

their intuitive appeal and simplicity. The progression of regression equations from those

developed for the Caltrac to Crouter’s multiple regression models for the ActiGraph and Actical

has demonstrated that while newer regression models can address many of the problems of older

models, the newer models also create new limitations. Given that two activities of very different

intensities can elicit the same number of counts/min (Lyden, Kozey et al. 2011) and the fact that

counts/min does not yield enough information to classify activity type, it is critical to move away

from relying solely on accelerometer counts for activity measurement. Additionally, Crouter’s

(Crouter, Clowers et al. 2006; Crouter and Bassett 2008) method of using the CV of an activity

to differentiate ambulatory from lifestyle activities indicates that within the counts/min output is

rich information in the accelerometer signal, and this information may be used for improving EE

measurement and also allowing for classification of activity type.

Using average accelerometer counts/min to estimate EE was originally done as much for

practical reasons as for scientific reasons. The Caltrac worked similarly to a pedometer in that it

could only record total counts to give an overall estimation of EE or total PA level (Wong,

Webster et al. 1981). The CSA represented a vast improvement in technology in that it could

aggregate counts into one-minute increments, allowing minute-by-minute estimations of EE and

estimations of PA intensity based on the EE in a given minute. Both accelerometers were

uniaxial and could only record accelerations in the vertical plane. Further improvements in

accelerometer technology have made the monitors smaller, lighter, cheaper, and able to record

triaxial accelerations at a rate of up to 100 times per second (100 Hz) and for upwards of 45 days

at a time on a single battery charge (GENEActiv 2013). This expansion in monitor capabilities

has led to advanced methods of data management and processing, collectively called “machine

learning,” that have been used to predict EE and classify activity type and which show great

promise for improving measurement of SB.

Machine learning is a term that describes an array of complex mathematical techniques

and algorithms that, coupled with an appropriate software package, can learn to recognize and

differentiate patterns in activities by examining certain input ‘features,’ which are summaries of

the data (e.g., mean, standard deviation, or skewness of the acceleration signal). Thus, machine

learning can be applied to accelerometer data in order to estimate EE, classify activity type,

and possibly measure SB (Preece, Goulermas et al. 2009). In order to use machine learning,

important features of the accelerometer data must be identified and extracted for use. Then,

these features can be used as inputs (or independent variables) into a machine learning algorithm,

which then provides a specific output (dependent variable), such as EE or activity classification.

Some features, such as root mean square error (RMSE) and CV, can be useful for differentiating

between static and dynamic activities (e.g., sitting vs. walking). Others, such as monitor

orientation, help differentiate different body postures (e.g., lying vs. standing). Conversely,

features such as mean, standard deviation, and entropy of the acceleration signal are useful for

distinguishing among dynamic activities and activity intensities (Preece, Goulermas et al. 2009).
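
Feature extraction of the kind described above can be sketched for a single window of acceleration data. The features shown (mean, standard deviation, skewness, and CV) are among those named in the text. The use of population moments and the function name are assumptions made for illustration, not a description of any specific published pipeline.

```python
import statistics


def extract_features(signal):
    """Summarize one window of an acceleration signal as a feature dict.

    ASSUMPTION: population (not sample) moments are used throughout.
    """
    mean = statistics.fmean(signal)
    sd = statistics.pstdev(signal)
    if sd > 0:
        third_moment = sum((x - mean) ** 3 for x in signal) / len(signal)
        skewness = third_moment / sd ** 3
    else:
        skewness = 0.0  # a constant signal has no asymmetry
    cv = 100.0 * sd / mean if mean != 0 else None
    return {"mean": mean, "sd": sd, "skewness": skewness, "cv": cv}
```

The resulting dictionary is the set of inputs (independent variables) that would be passed to a machine learning algorithm for each window.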

Many machine learning algorithms exist, but those that have been used most commonly

in PA research include artificial neural networks (ANNs), hidden Markov models, and decision

trees (Bao and Intille 2004; Pober, Staudenmayer et al. 2006; Preece, Goulermas et al. 2009;

Staudenmayer, Pober et al. 2009; Freedson, Lyden et al. 2011). While there is no consensus on

which machine learning technique is most accurate for PA measurement, the ANN technique has

a number of advantages over the other techniques: 1) the ability of ANNs to directly estimate

continuous and categorical variables and 2) the ability to construct ANNs using freely-available

software. First, a significant limitation of the decision tree and hidden Markov model is that they

can directly predict categorical variables, such as activity type, but cannot directly predict

continuous variables such as EE (Preece, Goulermas et al. 2009). Decision trees and hidden

Markov models can estimate EE indirectly by first classifying activity type and then predicting EE

using values from the Compendium of Physical Activities (Ainsworth, Haskell et al. 2011), but

this method is limited to predicting EE from only the activities the decision trees and hidden

Markov models were trained to classify and is subject to the same limitations for measuring EE

as when using the Compendium (i.e., different people may have different EE when performing

the same activities, EE values are averages, etc.). Second, while many machine learning

techniques must be conducted using complicated and expensive software packages (Pober,

Staudenmayer et al. 2006; Rothney, Neumann et al. 2007; Preece, Goulermas et al. 2009), a 2009

study by Staudenmayer et al. (Staudenmayer, Pober et al. 2009) implemented a relatively simple

way to use freely available software, the R statistical software, to extract features and process

accelerometer data using ANNs. Thus, their study offered a significant advancement in the field

because it was the first to make a complicated machine learning technique accessible to

researchers without extensive engineering, computer science, and/or mathematics backgrounds

or those without access to expensive statistical software packages.

In many respects, machine learning is similar to regression. First, ANNs, as well as other

machine learning techniques, work by taking a set of input variables (e.g., accelerometer counts,

raw acceleration data, monitor orientation, demographic variables) and using them to predict a

certain output (e.g., EE, activity type). Then, in order to create an ANN, the ANN must first be

calibrated or trained on a set of data where both the inputs and outputs are known. The ANN

then assigns certain weights to the input variables based on how important they are for predicting

the output (similar to coefficients in regression equations) (Preece, Goulermas et al. 2009).
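
The analogy between ANN weights and regression coefficients can be made concrete with a sketch of how a trained network with one hidden layer turns input features into a prediction. The network shape, the tanh activation, and all names are assumptions made for illustration. No training is shown, and any weights passed in would be hypothetical rather than values from a published model.

```python
import math


def ann_predict(features, hidden_weights, hidden_biases,
                output_weights, output_bias):
    """Forward pass of a one-hidden-layer network with a single output.

    hidden_weights holds one list of input weights per hidden unit.
    ASSUMPTION: tanh hidden units and a linear output unit.
    """
    hidden_activations = [
        math.tanh(sum(w * x for w, x in zip(unit_weights, features)) + bias)
        for unit_weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # The output is a weighted sum of the hidden units, much as a regression
    # prediction is a weighted sum of its predictors.
    return sum(
        w * h for w, h in zip(output_weights, hidden_activations)
    ) + output_bias
```

Training consists of adjusting these weights and biases on data where both inputs and outputs are known, which is the calibration step described above.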

However, ANNs are different from regression in two important ways. First, ANNs do

not assume that simple models can be fit to complex data (derived in a variety of settings and

from many different activities) (Preece, Goulermas et al. 2009). Thus, an ANN is much more

flexible than a regression model because it does not have to have some predetermined shape

(e.g., a line for a linear model or a curve for a quadratic model). Second, ANNs can take input

variables that contain much more information about an activity than minute-by-minute

accelerometer counts. For example, Staudenmayer’s model took second-by-second, uniaxial

(vertical axis) accelerometer count data and extracted the 10th, 25th, 50th, 75th, and 90th percentiles

from each minute’s data as the features to use as inputs into the ANN. By extracting these

percentiles, it is possible to derive information about the average, variance, and CV of the

accelerometer data. Thus, the model being created is using much more information from the

accelerometer, which should make it more accurate in predicting the desired outcome variables.

Using this approach, Staudenmayer et al. were able to improve EE estimates by 28-66%

compared to linear and multiple regression models. Additionally, while regression cannot be

used for activity type classification, Staudenmayer’s model correctly classified activities into

four different types (sedentary, lifestyle, ambulatory, or sport) with 88% accuracy

(Staudenmayer, Pober et al. 2009). A more detailed description of the ANN is offered in the

Methods sections of Chapters 3-5 of this dissertation.
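
The percentile features described above can be sketched as follows. The five percentiles and the use of one minute of second-by-second counts come from the text. The interpolation rule (linear interpolation between order statistics) and the function names are assumptions made for illustration, since the exact percentile method is not stated here.

```python
PERCENTILES = (10, 25, 50, 75, 90)  # from the text above


def percentile(sorted_values, p):
    """p-th percentile by linear interpolation between order statistics.

    ASSUMPTION: this interpolation rule is chosen for illustration only.
    """
    position = (len(sorted_values) - 1) * p / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (
        sorted_values[upper] - sorted_values[lower]
    )


def percentile_features(second_counts):
    """Five percentile features from one minute of second-by-second counts."""
    if len(second_counts) != 60:
        raise ValueError("expected 60 one-second count values")
    ordered = sorted(second_counts)
    return [percentile(ordered, p) for p in PERCENTILES]
```

The spread between the 10th and 90th percentiles carries information about variability within the minute, while the 50th percentile reflects its typical level, which is how five numbers can stand in for the average, variance, and CV.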

Although Staudenmayer’s use of machine learning for predicting EE is a significant step

forward, newer accelerometer models offer raw data recording in three axes and also provide

information about monitor orientation. Since accelerometer counts are derived from proprietary

filtering methods by the companies that manufacture each kind of accelerometer, use of

accelerometer counts does not allow for comparability of different brands of accelerometer. The

move to raw data collection and analysis allows for comparison between accelerometer models.

Also, more useful information can be extracted from the raw accelerometer data than from

activity counts, so use of raw data will likely improve the use of ANNs for EE and SB

measurement and activity classification.
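
One kind of information available from raw triaxial data but not from activity counts is monitor orientation, which can be sketched as the angle between one monitor axis and the measured gravity vector. The assumptions here are that accelerations are expressed in g, that the y axis is the monitor's vertical axis when worn upright, and that averaging over a window is enough to isolate gravity. The function name is hypothetical.

```python
import math


def mean_orientation_angle(samples):
    """Angle (degrees) between the monitor's y axis and mean acceleration.

    samples is a list of (x, y, z) accelerations in g.
    ASSUMPTION: y is the vertical axis when the monitor is worn upright.
    """
    n = len(samples)
    mean_x = sum(s[0] for s in samples) / n
    mean_y = sum(s[1] for s in samples) / n
    mean_z = sum(s[2] for s in samples) / n
    norm = math.sqrt(mean_x ** 2 + mean_y ** 2 + mean_z ** 2)
    if norm == 0:
        return None  # no usable gravity reference in this window
    cosine = max(-1.0, min(1.0, mean_y / norm))
    return math.degrees(math.acos(cosine))
```

Under these assumptions, a window recorded while upright would give an angle near 0 degrees and a window recorded while lying would give an angle near 90 degrees, even though both postures produce close to zero counts.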

Additionally, while Staudenmayer et al. were able to classify activity into four categories,

activity measurement will be significantly enhanced with proper identification of more specific

activity types and a more thorough classification of SB (e.g., identifying sitting and standing

separately instead of grouping them as ‘sedentary’). Thus, the current study will build off of

Staudenmayer’s research by including a slightly larger number of input features and raw data in

order to further improve measurement of EE and SB and classification of more activity types.

Multiple sensor methods

Given the limitations of single, hip-mounted accelerometers for measuring the wide

variety of activities that occur in free-living settings, some researchers have used multiple

sensors to improve EE measurement and classify activity type. These efforts generally fall into

one of two categories: 1) utilizing monitors that collect acceleration data along with other

physiologic variables (e.g., HR, skin temperature) or 2) use of multiple accelerometer-based

monitors placed on different parts of the body. Both types will be discussed in the following

text.

First, the combination of accelerometry and physiologic measures has been used to

improve EE and activity intensity measurement. HR and accelerometry are both popular

methods of measuring PA intensity and EE, but both have notable limitations when used on their

own. In an effort to minimize the limitations of each method while capitalizing on their

strengths, researchers have developed regression methods which use both HR and accelerometer

counts to predict EE. Haskell et al. (Haskell, Yee et al. 1993) were the first to use a combination

of HR and movement data to try to improve EE estimation. The authors had 19 men perform

seven ambulatory and exercise activities while wearing a HR monitor, two Vitalog activity

monitors (on the wrist and thigh), and a metabolic analyzer. Overall, using both HR and body

motion significantly improved overall EE estimation compared to using HR or accelerometry

only, although much of the improvement came from using individualized HR-EE curves (as

opposed to a general curve applied to all participants). Follow-up studies have also shown

improvements in EE prediction with combined HR and accelerometer data (Moon and Butte

1996; Strath, Bassett et al. 2001; Strath, Bassett et al. 2002; Plasqui and Westerterp 2005), but

these studies also used individual calibration curves for HR, dramatically increasing researcher

burden for accurate data collection and limiting the generalizability of the regression models to

the participants from whom they were created. The Actiheart activity monitor (Philips, Bend,

OR) attempts to reduce the burden of multiple monitors by combining an accelerometer and HR

monitor into one device, which is fastened to a person’s chest with a sticky pad for continuous

wear. The Actiheart tends to have good wear compliance, but men have to shave their chest to

wear the monitor, and women tend to report lower comfort with the Actiheart than with hip-

mounted monitors (Moy, Sallis et al. 2010). Thus, while using both HR and accelerometer

counts seems to improve EE measurement, the added cost and burden to researchers in creating

individual HR curves and using multiple monitors per participant prohibits the use of this method

in large studies. Additionally, participant compliance with HR monitors tends to be lower than

with self-report or accelerometer tools (Janz 2002; Andre and Wolf 2007), providing another

limitation of their use for activity measurement.

Another measurement device, the BodyMedia armband (formerly called the Sensewear

armband; BodyMedia, Inc., Pittsburgh, PA), is a single monitor (worn on the upper arm) that

records biaxial acceleration data as well as heat flux, galvanic skin response, skin temperature,

and ambient temperature. The armband uses these variables, along with self-reported gender,

age, height, and weight to predict EE through proprietary algorithms developed by BodyMedia.

The armband was first validated by Jakicic et al. (Jakicic, Marcus et al. 2004) in 2004 for

estimating EE from walking, stepping, and leg and arm ergometry in 40 adults; their results

indicate that the armband provided much better estimation of EE than a hip-mounted TriTrac

accelerometer for these four exercise activities. Further research on the armband has validated

its use for estimating exercise and free-living EE in many populations, including children

(Arvidsson, Slinde et al. 2007), younger and older adults (Welk, McClain et al. 2007;

Heiermann, Khalaj Hedayati et al. 2011), pregnant women (Berntsen, Stafne et al. 2011), and

diseased or obese individuals (Mignault, St-Onge et al. 2005; Papazoglou, Augello et al. 2006;

Dwyer, Alison et al. 2009). In studies comparing the armband to traditional accelerometers for

EE measurement, the armband frequently performs similarly to hip-mounted accelerometers whose

data are analyzed with linear regression models (Jakicic, Marcus et al. 2004; Welk, McClain

et al. 2007; Berntsen, Hageberg et al. 2010; Colbert, Matthews et al. 2011), although some

research indicates reductions in error by as much as 20% using the armband (Lee, Kim et al.

2014). Therefore, it is possible that the addition of physiologic measures improves estimates of

EE. Additionally, the armband’s skin temperature and heat flux sensors help to verify when the

monitor is actually being worn (wear time), which is an important issue in accelerometry-based

measurement (Masse, Fuemmeler et al. 2005; Evenson and Terry 2009).

Despite these advantages, the armband has some key limitations that prevent it from

being an optimal measurement tool. The armband’s primary limitation is that it estimates EE

using BodyMedia’s proprietary algorithms. While proprietary algorithms can be useful to

consumers and end users who want EE estimation or time spent in MVPA without needing to

develop their own prediction model, proprietary algorithms hinder scientific progress because

they do not allow researchers transparency as to how EE is being predicted or which input

variables are most useful for EE prediction. Without this knowledge, it becomes very difficult to

identify armband strengths and limitations or identify variables that might be used to further

improve EE measurement. Additionally, BodyMedia constantly refines its prediction algorithms

to improve EE measurement, but without knowing how the algorithms work or which variables

are most important, it is very difficult to compare results obtained using the different algorithms,

hindering generalizability or comparability of study results. Finally, the armband only provides

estimates of EE and activity intensity, and its proprietary data analysis prohibits researchers from

accessing the raw data in order to be able to use newer data processing techniques to determine

activity type.

Another multi-sensor method researchers have studied is to use multiple accelerometers

positioned on different parts of the body to improve EE measurement and activity classification.

When creating their linear regression model for measuring EE of lifestyle activities (discussed

earlier), Swartz et al. (Swartz, Strath et al. 2000) also had participants wear a CSA on the wrist to

determine if using acceleration information from the hip and wrist locations simultaneously

could improve EE measurement. Compared to the correlation of r=0.56 for the hip regression

equation, the combination of hip and wrist acceleration improved the correlation only minimally,

to r=0.59. The minimal improvement seen in this study does not seem worth the added burden

on participants (due to compliance issues) or researchers (for the added data to be analyzed).

More recently, researchers have built and tested systems of accelerometers, where each

accelerometer has a wired or wireless link to a central unit, allowing the unit to process data from

the accelerometers simultaneously. A complete comparison of the systems can be found in

Table 2.1. One example of this is the Intelligent Device for Energy Expenditure and Activity

(IDEEA; MiniSun, Fresno, CA). Produced in the early 2000s, it is a system of five

accelerometers (worn on both feet, both thighs, and the chest) that are wired to a processing unit

worn on the hip. The sensors are taped to the skin, and the wires are to be worn underneath

clothing to minimize risk of breaking. Data collected from the IDEEA monitor are processed via

proprietary algorithms and are used to predict EE, activity type, activity duration and intensity,

and activity speed (for walking and running). Validation studies have shown 98.7% accuracy for

classifying 32 activities and postures (Zhang, Werner et al. 2003) and a correlation of r=0.973 for

measuring EE during a simulated free-living setting (Zhang, Pi-Sunyer et al. 2004). Its high

accuracy for measuring both activity type and EE makes it ideal as a criterion measure in short-

term, free-living studies (Welk, McClain et al. 2007; Gyllensten and Bonomi 2011). However,

the IDEEA’s limited battery life (48 hours), excessive participant burden, fragile design,

proprietary algorithms for predicting its outcome variables, and high cost ($5,000 per unit)

prohibit its use as a measurement tool for large, free-living studies.

Since the creation of the IDEEA, another system has been developed by Tapia et al. that

uses five wireless sensors (placed on the right ankle, thigh, hip, wrist and upper arm), a heart rate

monitor, and open-source data analysis to overcome many of the shortcomings of the IDEEA

monitor. Using machine learning algorithms, Tapia et al. were able to classify 30 activities and

postures with 56.3% accuracy with only accelerometer data and 58.4% using accelerometer and

HR data (Tapia, Intillie et al. 2007). While their classification accuracy was much lower than the

IDEEA system, the activities that Tapia et al. used were more similar to each other than the

activities in the IDEEA validation, and the activities Tapia’s system misclassified were often

different intensity levels of a given activity (e.g., cycling hard intensity at 30 rpm vs. cycling

moderate intensity at 30 rpm). Importantly, Tapia’s study indicates that HR may have little

value for classifying activity type, especially when using advanced processing techniques for

accelerometer data. In a similar study, Dong et al. developed a wireless system of three

accelerometers (worn on the right ankle, thigh, and wrist) for measurement of activity type and

EE. In a validation of the system, they found that activity classification accuracy for 14 activities

was 71.3-78.3% using only one accelerometer (with the thigh providing the best classification

accuracy and the wrist providing the lowest) but improved to 89.6-96.2% using two

accelerometers (with the ankle and wrist combination providing the highest classification

accuracy) and only improved slightly (to 97.0%) using all three accelerometers (Dong, Montoye

et al. 2013). The improvement in classification accuracy of this system compared to Tapia’s

may be due to using fewer activities in the validation, but it may also be due to difference in

machine learning approach and features used as input variables. Despite dramatic improvements

in activity type classification accuracy achieved when using multiple accelerometers, preliminary

analyses with the system developed by Dong et al. indicate that using data from all three

accelerometers provides only minimal improvement over use of a single monitor for EE

measurement (Dong, Biswas et al. 2013; Montoye, Dong et al. 2013).

Overall, inclusion of physiologic variables and/or additional accelerometers appears to

improve activity type classification and possibly EE measurement, but it is unclear which

variables or monitor locations are most useful to be included. Inclusion of HR does not appear to

improve activity classification (Tapia, Intillie et al. 2007). Also, using two or more

accelerometers can markedly improve accuracy of activity classification (Zhang, Werner et al.

2003; Dong, Montoye et al. 2013) but may not be as useful for EE measurement (Metcalf,

Curnow et al. 2002; Zhang, Pi-Sunyer et al. 2004; Dong, Biswas et al. 2013; Montoye, Dong et

al. 2013). However, the added burden of measuring additional variables restricts the use of these

technologies and methods to small, short-duration studies. To help ensure high compliance rates,

reduce both researcher and participant burden, and allow for accurate measurement in large

studies, development of accurate measurement techniques for single accelerometers is needed.

Table 2.1. Comparison of wireless accelerometer systems for activity classification accuracy and EE prediction accuracy.

| Study | Participant characteristics | Placement of monitors | Number and types of activities | Activity classification accuracy | EE prediction accuracy |
| --- | --- | --- | --- | --- | --- |
| Dong et al. (Dong, Montoye et al. 2013) | 40 adults | Right wrist, thigh, and ankle | 14 sedentary, ambulatory, lifestyle, and exercise activities (11 distinct, 3 variations). Laboratory-based protocol. | All 3 monitors: 97.0%; ankle and wrist: 96.2%; thigh and wrist: 91.0%; ankle and thigh: 89.6%; thigh: 78.3%; ankle: 78.3%; wrist: 71.5% | N/A |
| Tapia et al. (Tapia, Intillie et al. 2007) | 21 adults | Accelerometers: right wrist, upper arm, thigh, hip, and ankle; HR monitor: chest | 30 gymnasium activities (13 distinct, 17 variations). Laboratory-based protocol. | Without HR: 56.3%; with HR: 58.4% | N/A |
| Zhang et al. (Zhang, Werner et al. 2003) | 68 adults | IDEEA: right and left foot, right and left thigh, chest | 32 activities (5 distinct, 22 variations, and 5 limb movements). Combination of laboratory-based and simulated free-living protocols. | 98.7% | N/A |
| (a) Dong et al. (Dong, Biswas et al. 2013); (b) Montoye et al. (Montoye, Dong et al. 2013) | 25 adults | Right wrist, thigh, and ankle | 14 sedentary, ambulatory, lifestyle, and exercise activities (11 distinct, 3 variations). Simulated free-living protocol. | N/A | (a) 3-monitor system similar to or better than hip for 10 of 14 activities. (b) Correlations (r): 3-monitor system: 0.81; thigh: 0.80; ankle: 0.79; wrist: 0.74. RMSE (METs): 3-monitor system: 1.61; thigh: 1.61; ankle: 1.69; wrist: 1.85 |
| Zhang et al. (Zhang, Pi-Sunyer et al. 2004) | 37 adults | IDEEA: right and left foot, right and left thigh, chest | Lab-based: 11 activities (5 distinct, 6 variations). Simulated free-living: 2 required (walking and running), and the rest were left up to participant. | N/A | Lab-based: 98.9%; simulated free-living: 95.1% |

Accelerometer placement

Accelerometers can be placed anywhere on the body in order to record movement of the

head, limbs, torso, etc. From their first use in PA measurement, accelerometers were placed on the

hip to measure whole-body movement, and preliminary studies showed the hip placement to have

higher correlations with measured EE compared to wrist or ankle placements (LaPorte, Kuller et

al. 1979; Montoye, Washburn et al. 1983). Additionally, validation and cross-validation studies

(Freedson, Melanson et al. 1998; Lyden, Kozey et al. 2011; Sasaki, John et al. 2011) show high

correlations of hip-mounted accelerometer counts to EE during ambulatory activities, which

comprise a high percentage of the types of PA in which people engage (Ham, Kruger et al. 2009).

Despite the common use of hip-mounted accelerometers for measuring PA, there are many notable

limitations associated with their use. First and foremost, when count-based regression equations

are used, hip-mounted accelerometers dramatically underestimate the EE associated with lifestyle

activities while overestimating the EE cost of SB (Swartz, Strath et al. 2000; Crouter, Churilla et

al. 2006; Rothney, Schaefer et al. 2008; Lyden, Kozey et al. 2011). Even with sophisticated

machine learning techniques, hip-mounted accelerometers still cannot accurately classify time

spent in SB or SB type (Staudenmayer, Pober et al. 2009; Freedson, Lyden et al. 2011; Kozey-

Keadle, Libertine et al. 2011; Lyden, Kozey Keadle et al. 2012). Additionally, the newest

ActiGraph GT3X+ accelerometer, which was built with an inclinometer (to improve detection of

posture and differentiate among lying, sitting, standing, and movement), still frequently

misclassifies SB type (Kozey-Keadle, Libertine et al. 2011; Carr and Mahar 2012; Hanggi, Phillips

et al. 2012; Lyden, Kozey Keadle et al. 2012). Given the similar angle of the hip for sitting and

standing (Parkka, Ermes et al. 2006; De Vries, Garre et al. 2011), it is not surprising that these

studies have found frequent misclassification of sitting as standing (and vice versa).

Another significant limitation of hip-mounted accelerometers is that it is not clear if they

can be used effectively for measuring activity in pregnant or obese individuals. In both pregnancy

and obesity, hip-mounted accelerometers experience severe tilt, which changes the orientation of

the accelerometer and alters the accelerations being measured, significantly lowering their

accuracy for PA measurement (Shepherd, Toloza et al. 1999; Feito, Bassett et al. 2011; DiNallo,

Downs et al. 2012). Additionally, when worn on the hip, accelerometers must be secured using a

waist band, which may be uncomfortable for obese or pregnant individuals. To support this point,

Harrison et al. (Harrison, Thompson et al. 2011) asked pregnant women at 26-28 weeks gestation

to wear a pedometer and accelerometer for one week to measure free-living PA. Despite their

stated efforts to minimize tilt angle and maximize comfort, 37% of their sample did not meet the

minimum wear time requirements, and the authors attributed this in part to lack of comfort wearing

the waist band to hold the accelerometer.

Clearly, despite the widespread use of hip-mounted accelerometers, the hip is far from

perfect as a placement site for measuring EE, SB, or activity type. Recently, researchers have

renewed efforts to find alternate accelerometer locations for EE measurement and activity

classification. Some locations have included the lower back, chest, wrist, ankle, and thigh. The

strengths and weaknesses of each will be discussed in the following text. For a summary of current

findings of accelerometer performance for different body locations, please see Table 2.2.

First, the lower back and chest locations share many advantages and disadvantages of the

hip location. Since these three locations are on the torso, they all measure total body movement

and are minimally affected by erratic movements of the limbs, which can lower accuracy of EE

prediction and hurt classification of activity type (Rosenberger, Haskell et al. 2013). Additionally,

the chest location may be appealing in some contexts because it is worn under clothing and can

easily be implemented in a device that also measures heart rate, allowing for measurement of

multiple physiologic variables with a single device (Brage, Brage et al. 2005). Similarly, the lower

back and chest have the advantage of being placed at the midline of the body (as opposed to the

left or right sides), which removes any difficulties with discrepancies that can occur when monitors

are worn on dominant vs. non-dominant sides of the body (Nichols, Morgan et al. 1999; Trost,

McIver et al. 2005). However, the lower back and chest locations also suffer from limitations similar to

those of the hip in their poor measurement of SB and certain lifestyle activities (e.g., household chores or

cycling) and their lack of feasibility for continuous wear.

A popular accelerometer location in recent years is the wrist. Wrist-mounted

accelerometers are appealing because they can be worn like a watch, attracting minimal attention

and enhancing comfort. Also, wrist-worn accelerometers allow for continuous wear (assuming the

monitor is waterproof), which is likely to increase compliance. Within the last few years, the

National Health and Nutrition Examination Survey (NHANES) in the US and the Biobank study in

the UK switched from hip-mounted to wrist-mounted accelerometers in the hope of improving

compliance, which has been a significant issue in their surveillance efforts (UBCC 2009).

Preliminary data from the latest NHANES cycle have indicated that compliance may be slightly

improved, with average wear-time almost an hour longer per participant (Troiano and McClain

2012), supporting the wrist as a viable location for large-scale studies measuring EE

and some types of activity. Moreover, wrist-mounted accelerometers have long been recognized

for their utility as objective measures of sleep (Kripke, Mullaney et al. 1978; Mullaney, Kripke et

al. 1980) and have very high validity for measuring total sleep time and sleep quality (Jean-Louis,

Kripke et al. 2001); therefore, wrist-worn accelerometers may allow 24-hour measurement of EE,

activity type, and sleep (Webster, Kripke et al. 1982). Additionally, while regression approaches

for wrist accelerometers have yielded lower accuracy than hip accelerometers (Montoye,

Washburn et al. 1983; Swartz, Strath et al. 2000), machine learning techniques have dramatically

improved the utility of wrist-worn accelerometers for measuring EE and activity type. Mannini et

al. (Mannini, Intille et al. 2013) found that machine learning algorithms developed from a wrist-

worn accelerometer classified 26 activities into four activity categories with about 84% accuracy,

which is only slightly lower than the classification accuracies of algorithms developed from single,

hip-mounted accelerometers (Staudenmayer, Pober et al. 2009; Freedson, Lyden et al. 2011; Trost,

Wong et al. 2012). Additionally, preliminary findings by Montoye et al. (Montoye, Dong et al.

2013) found that the wrist accelerometer could achieve high correlations (r=0.71) when predicting

EE in a simulated free-living environment, suggesting that use of machine learning could allow

accurate EE measurement using wrist accelerometers.
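
The two agreement statistics reported for EE prediction in this section, the Pearson correlation (r) and the root-mean-square error (RMSE), can be illustrated with a minimal Python sketch. The function names and the MET values below are hypothetical illustrations and are not taken from any of the cited studies.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rmse(predicted, criterion):
    """Root-mean-square error of predictions against a criterion measure."""
    squared = [(p - c) ** 2 for p, c in zip(predicted, criterion)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical EE values in METs (illustration only, not study data).
criterion_mets = [1.0, 1.5, 3.0, 4.5, 6.0, 8.0]   # e.g., from a metabolic analyzer
predicted_mets = [1.4, 1.2, 3.6, 4.0, 5.1, 7.2]   # e.g., from an accelerometer model

print("r =", round(pearson_r(predicted_mets, criterion_mets), 2))
print("RMSE (METs) =", round(rmse(predicted_mets, criterion_mets), 2))
```

The two statistics answer different questions: r describes how closely predictions track the criterion, whereas RMSE describes the typical size of the error in METs, so a model can have a high r and still a large RMSE if it is systematically biased.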

However, there are also a number of studies where direct comparisons of the hip and wrist

show that the hip has higher accuracy for EE prediction, activity type classification, and SB

measurement (Zhang, Rowlands et al. 2012; Rosenberger, Haskell et al. 2013). In a study by

Rosenberger et al. (Rosenberger, Haskell et al. 2013), participants performed 20 activities while

wearing wrist- and hip-mounted accelerometers and a portable metabolic analyzer (for a criterion

measure of EE). The algorithms they created for the hip accelerometer had better sensitivity and

specificity for SB (71% and 96% vs. 53% and 76%) and MVPA (70% and 83% vs. 30% and 69%)

measurement compared with the wrist accelerometer, and their algorithms for EE had lower errors

(0.55 vs. 0.82 METs) and higher correlations (r=0.72 vs. r=0.36) with the hip accelerometer than

the wrist monitor. They attributed the superiority of the hip location for EE measurement and SB

and MVPA classification to the fact the trunk of the body requires more energy to move, so

measurements of trunk movement represent the contraction of larger muscle masses. For SB

classification, the authors postulated that SB can have significant variability in arm movement (e.g.,

working on the computer or driving vs. lying), diminishing the ability of a wrist-worn

accelerometer to detect differences between these behaviors and lifestyle activities involving

intermittent whole-body movement (e.g., sweeping or washing dishes). Despite some

drawbacks of the wrist-worn accelerometer, its potential to promote high compliance rates and to

allow 24-hour measurement of PA, SB, and sleep makes it an attractive site for accelerometer

placement.

Similar to wrist-mounted accelerometers, machine learning algorithms developed for

ankle-mounted accelerometers can provide good detection accuracy of ambulatory and lifestyle

activities (Mannini, Intille et al. 2013), and they have also been shown to accurately estimate

walking/running speeds (Foster, Lanningham-Foster et al. 2005), especially when used with

machine learning algorithms. In a direct comparison of wrist- and ankle-mounted accelerometers,

a study by Mannini et al. found overall classification accuracies of 95% for the ankle-mounted

accelerometer vs. 84.7% for the wrist-mounted accelerometer. Similarly, analyses by Montoye et

al. (Montoye, Dong et al. 2013) found that in a free-living simulation, ankle-mounted

accelerometers had significantly higher correlations with measured EE than wrist-mounted

accelerometers and similar correlations to thigh-mounted accelerometers (r=0.79, 0.71, and 0.80

for the ankle, wrist, and thigh, respectively). However, Dong et al. (Dong, Montoye et al. 2013)

found that a single ankle-mounted accelerometer was unable to detect differences between sitting

and standing since monitor motion and orientation were similar for these activities, rendering the

ankle accelerometer ineffective for accurate measurement of SB. Additionally, compliance is a

significant limitation of ankle-worn accelerometers because they cannot

be worn with high-top shoes or boots, and they look somewhat similar to police tethers (Mannini,

Intille et al. 2013). Thus, ankle-mounted accelerometers may be useful as part of a multi-sensor

system, but as a single unit they do not function well for SB measurement or activity classification

and may have limited compliance.

A thigh-mounted accelerometer may represent a good compromise of the advantages and

disadvantages of the previously discussed accelerometer placements. Placement of an

accelerometer on the thigh should allow for good accuracy for prediction of EE as well as

measurement of SB and activity classification. Since the thigh is closer to the center of the body

than the wrist or ankle, an accelerometer on the thigh is likely to have superior validity for

estimating EE than one on the wrist or ankle, especially given that the thigh placement allows for

tracking of some of the largest muscle groups in the body (gluteal and quadriceps muscles).

Moreover, similar to an ankle-mounted accelerometer, a thigh-mounted accelerometer should be

able to accurately capture stepping motions, allowing for good activity type classification with

ambulatory and lifestyle activities. Finally, whereas time in SB type and breaks in SB are poorly

detected with the previously discussed accelerometer placements, thigh angle is different between

sitting and standing, making the differentiation between SB and non-sedentary activities relatively

simple using monitor orientation (for differentiating sitting from standing) and acceleration data

(for differentiating sitting from cycling).
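
The orientation-based reasoning above can be sketched in Python. This is a minimal illustration, not the method of any cited study: it assumes a static (gravity-only) reading in g units, that the sensor's x axis runs along the length of the thigh, and a hypothetical 45-degree cut-point. Separating sitting from cycling would additionally require the acceleration signal itself, which is not shown here.

```python
import math

def thigh_inclination_deg(ax, ay, az):
    """Angle (0-90 degrees) between the thigh's long axis and vertical.

    Assumes ax is measured along the thigh and the reading is dominated
    by gravity. The absolute value makes the result independent of
    whether the axis points up or down the leg.
    """
    magnitude = math.sqrt(ax * ax + ay * ay + az * az)
    if magnitude == 0:
        raise ValueError("zero-magnitude acceleration vector")
    cosine = min(1.0, abs(ax) / magnitude)
    return math.degrees(math.acos(cosine))

def classify_posture(ax, ay, az, threshold_deg=45.0):
    """Label a static sample; threshold_deg is a hypothetical cut-point."""
    if thigh_inclination_deg(ax, ay, az) < threshold_deg:
        return "standing"       # thigh close to vertical
    return "sitting/lying"      # thigh close to horizontal

print(classify_posture(0.98, 0.05, 0.10))  # thigh roughly vertical
print(classify_posture(0.10, 0.05, 0.98))  # thigh roughly horizontal
```

A hip-mounted monitor cannot use this rule because, as noted earlier in this section, the hip has a similar angle during sitting and standing.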

Despite the theoretical superiority of a thigh-mounted accelerometer, this placement has

received little research attention until recently. Initial opposition to the thigh location was not due

to lack of utility; an early study by an engineering group (Veltink, Bussmann et al. 1996) showed

that acceleration signals from a thigh accelerometer could be used to distinguish between sitting

and standing and among stair use, walking, and cycling.

However, the accelerometers used by this group were not available commercially, prohibiting their

widespread use in PA measurement. At the time, the Caltrac, ActiGraph, and other commercially

available accelerometers were much too large and heavy to be worn on the thigh, and their size

would have made them unattractive to wear in free-living environments.

Use of a thigh-mounted accelerometer was recently brought to prominence by the

development of the activPAL accelerometer (PAL Technologies, Glasgow, Scotland), a small, thin

accelerometer specifically designed to be mounted on the thigh using a small strap or a sticky

patch. In numerous studies, the activPAL accelerometer has shown high accuracy for quantifying

time in SB, breaks in SB, and classifying SB type (Grant, Ryan et al. 2006; Kozey-Keadle,

Libertine et al. 2011; Aminian and Hinckson 2012; Lyden, Kozey Keadle et al. 2012), and it is

frequently used as a criterion measure of free-living SB (Hart, Ainsworth et al. 2011; Lord, Chastin

et al. 2011; Martin, McNeill et al. 2011). However, a major shortcoming of the activPAL is that it

relies on proprietary software for the determination of lying/sitting, standing, and stepping. While

the proprietary software makes the device user friendly, it does not allow researchers to identify

important aspects of movement or improve on the company’s algorithms to allow the activPAL to

predict EE or PA type. Additionally, at $600+ per monitor, the activPAL is considerably more

expensive than the $300 ActiGraph GT3X+ or $225 GENEActiv, which are both now as small as

the activPAL and have the added advantages of being water resistant (ActiGraph) or waterproof

(GENEA) and allowing for raw data recording and extraction.

Although the activPAL is limited in its utility to predict EE or classify non-sedentary

activity type, its development and validation provide a proof of concept for the utility of the thigh

location for classifying activity type and measuring SB and EE. Recently, a study by

Skotte et al. (Skotte, Korshoj et al. 2012) developed a novel method to use a thigh-mounted

ActiGraph accelerometer for measuring a total of six sedentary, ambulatory, and lifestyle activities.

Using a combination of accelerometer orientation and acceleration data, they were able to correctly

classify the six activities with close to 99% accuracy in a simulated free-living protocol.

Additionally, the thigh-mounted accelerometer had significantly better sensitivity and specificity

(98.2% and 93.3%) for measuring free-living SB than the hip-mounted accelerometer (72.8% and

58.0%).
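
Sensitivity and specificity, as used throughout this section for SB measurement, can be computed from paired criterion and monitor labels as in the following minimal Python sketch. The function name, the label strings, and the ten example epochs are hypothetical and do not come from the cited studies.

```python
def sensitivity_specificity(criterion, predicted, positive="SB"):
    """Return (sensitivity, specificity) for one target class.

    criterion: labels from the criterion measure (e.g., direct observation)
    predicted: labels from the monitor, in the same order
    positive:  the class treated as the "positive" outcome
    """
    tp = fn = tn = fp = 0
    for truth, guess in zip(criterion, predicted):
        if truth == positive:
            if guess == positive:
                tp += 1
            else:
                fn += 1
        else:
            if guess == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative epoch")
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical epoch-by-epoch labels (illustration only).
observed = ["SB", "SB", "SB", "SB", "PA", "PA", "PA", "PA", "PA", "PA"]
monitor  = ["SB", "SB", "SB", "PA", "PA", "PA", "PA", "PA", "PA", "SB"]

sens, spec = sensitivity_specificity(observed, monitor)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Sensitivity is the share of true SB epochs that the monitor labels as SB, and specificity is the share of non-SB epochs that it correctly labels as not SB, so a monitor that confuses sitting with standing loses on both measures.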

Moreover, analysis of a single, thigh-mounted accelerometer from a wireless accelerometer

system developed at Michigan State University showed 78.3% classification accuracy for 14

sedentary, ambulatory, lifestyle, and exercise activities (Dong, Montoye et al. 2013). Additionally,

preliminary analyses of EE measurement accuracy indicated that the thigh-mounted accelerometer

achieved high correlations (r=0.80) with criterion-measured EE (Montoye, Dong et al. 2013).

However, the accelerometer used in the wireless system is not commercially available and only

measures two axes of acceleration data. Validation of a triaxial, commercially available

accelerometer mounted on the thigh for classifying activity type and measuring SB and EE is a

logical next step in determining the utility of the thigh location as a measurement site.

In conclusion, while it appears that no single accelerometer placement is ideal for all

movements or all contexts, the thigh location may represent the best compromise of comfort and

measurement accuracy. The hip is well researched and provides good estimation of total body

movements, ambulatory activities, and EE. Additionally, the wrist seems to have slightly lower

accuracies for activity type and EE prediction, but the ability to record sleep measures and improve

participant compliance rates makes the wrist appealing for large studies and total day recording.

The thigh appears to be a good compromise of the hip and wrist locations. Since the thigh is very

close to the torso, it is less affected by erratic limb movements than the wrist or ankle. Also,

placement on the thigh is beneficial for detecting certain lifestyle and cycling activities and shows

the greatest promise for accurate measurement of SB. Additionally, with the low-profile, water

resistant/waterproof designs of the ActiGraph and GENEA accelerometers, thigh-mounted

accelerometers could be placed under clothing with a small strap or sticky patch, allowing for

continuous wear with minimal discomfort. We believe that by applying machine learning

techniques to thigh-mounted accelerometer data, we can develop algorithms with better accuracy

for classifying activity type and measuring EE and SB than can be achieved with hip- or wrist-

mounted accelerometers.

Table 2.2. Comparison of different monitor placements for activity classification accuracy and EE prediction accuracy.

| Study | Participant characteristics | Placement of monitors | Number and types of activities | Activity classification accuracy | EE prediction accuracy |
| --- | --- | --- | --- | --- | --- |
| Dong et al. (Dong, Montoye et al. 2013) | 40 adults | Right wrist, thigh, and ankle | 14 sedentary, ambulatory, lifestyle, and exercise activities (11 distinct, 3 variations). Lab-based protocol. | Thigh: 78.3%; Ankle: 78.3%; Wrist: 71.5% | N/A |
| Mannini et al. (Mannini, Intille et al. 2013) | 33 adults | Right wrist and ankle | 26 sedentary, cycling, ambulatory, and lifestyle activities (10 distinct, 16 variations) | Ankle: 95.0%; Wrist: 84.7% | N/A |
| Staudenmayer et al. (Staudenmayer, Pober et al. 2009) | 48 adults | Right hip | 20 activities (18 distinct, 2 variations) | 88.8% | RMSE (METs): ANN: 1.22; Linear regression: 1.51 to 2.09. Bias (METs): ANN: 0.05; Linear regression: -0.30 to -1.21 |
| Zhang et al. (Zhang, Rowlands et al. 2012) | 60 adults | Right hip, right wrist, and left wrist | 12 sedentary, lifestyle, and ambulatory activities (8 distinct, 4 variations). Combination of lab-based and simulated free-living protocol. | Hip: 99.1%; Right wrist: 97.0%; Left wrist: 95.9% | N/A |
| Skotte et al. (Skotte, Korshoj et al. 2012) | 17 adults | Right thigh | 6 sedentary, lifestyle, and ambulatory activities | Sensitivity and specificity were both 99% for activity discrimination | N/A |
| Montoye et al. (Montoye, Washburn et al. 1983) | 21 adults | Wrist hip and left wrist | 14 ambulatory and exercise activities (6 distinct, 8 variations). Lab-based protocol. | N/A | Reliability: Wrist: r = 0.74; Hip: r = 0.63. Standard error: Wrist: 7.9 ml/kg/min; Hip: 9.2 ml/kg/min |
| Montoye et al. (Montoye, Dong et al. 2013) | 27 adults | Right wrist, thigh, and ankle | 14 sedentary, ambulatory, lifestyle, and exercise activities (11 distinct, 3 variations). Simulated free-living protocol. | N/A | Correlations (r): Thigh: 0.80; Ankle: 0.79; Wrist: 0.74. RMSE (METs): Thigh: 1.61; Ankle: 1.69; Wrist: 1.85 |
| Rosenberger et al. (Rosenberger, Haskell et al. 2013) | 37 adults | Dominant hip and wrist | 13 sedentary, lifestyle, cycling, and ambulatory activities. Combination of lab-based and simulated free-living protocol. | N/A | Correlations (r): Hip: r = 0.72; Wrist: r = 0.36 |
| Swartz et al. (Swartz, Strath et al. 2000) | 70 adults | Right hip and dominant wrist | 27 lifestyle, occupational, exercise, and ambulatory activities. Combination of lab-based and simulated free-living protocol. | N/A | Correlations (r): Wrist: r = 0.18; Hip: r = 0.56 |

Laboratory-based vs. free-living settings

Ultimately, measurement techniques need to be validated in a context similar to the

setting in which they will be used. When a technique is first tested, activities are generally

performed in a laboratory-based setting, where maximum control can be exerted over the timing

and order of activities performed. In these laboratory-based studies, activities are usually

performed in order of increasing intensity for at least 5-7 minutes, allowing participants to reach

steady-state EE (where EE matches the demands of the activity). Additionally, the activities

must be performed in a specific manner (e.g., walking/jogging speeds and cycling cadences are

the same for all participants), so that there is minimal variability in the activities (Freedson,

Melanson et al. 1998).

Laboratory-based validation studies are a crucial first step in the testing of measurement

devices because they provide a proof of concept that a given measurement device or method can

work well in a highly controlled environment. Additionally, highly valid criterion measures,

such as metabolic analyzers for measuring EE and DO for determining activity type, are

available for use in laboratory-based settings. However, once measurement methods are

validated in a laboratory, they must then be tested in a free-living environment since laboratory

conditions are very different from activities and settings that people encounter in their everyday

lives. In free-living settings, people are seldom engaged in steady-state activities and do not

normally perform activities for defined amounts of time, and there can be substantial variability

within activity types. For example, free-living walking rarely occurs at a constant speed, and

preferred walking speed can differ considerably among individuals. Additionally, treadmill and

non-treadmill walking elicit different gait patterns (Dingwell, Cusumano et al. 2001), lowering

the potential to generalize detection of treadmill walking to the detection of free-living walking.

To support this point, a study by Gyllensten and Bonomi (Gyllensten and Bonomi 2011) found that an

ANN created in the laboratory using data from a single accelerometer (located on the lower

back) had 94% accuracy for classifying five categories of activities, but this accuracy dropped to

75% when used in a free-living setting (with IDEEA used as criterion). Additionally,

Lyden et al. (Lyden, Keadle et al. 2013) found that an ANN created in the laboratory performed

well in the laboratory but very poorly when applied to a free-living scenario, with biases of 33%

and 73% when estimating MET-hours and minutes spent in MVPA, respectively. Therefore,

they recreated their ANNs using free-living data. These findings have been further confirmed by

other studies (Bao and Intille 2004; Ermes, Parkka et al. 2008; Crouter, Kuffel et al. 2010),

providing strong evidence that laboratory validations must be applied to free-living settings with

caution. Therefore, it is important to incorporate aspects of a free-living environment into

validation protocols so that results obtained can be applied to real-world situations.

However, conducting validation studies in a true free-living environment is not feasible

due to the lack of a suitable criterion measure for measuring EE or activity type. Doubly-labeled

water is a commonly used method for assessing free-living EE, but this method only works well

for measuring total EE over a period of 1-2 weeks and cannot yield information about timing,

type, duration, or intensity of PA. Therefore, doubly-labeled water cannot give an indication of

how well a measurement method predicts EE for specific activities. Additionally, since most

activity monitors cannot be worn continuously (i.e., must be removed for showering and

sleeping), doubly-labeled water captures a significant amount of EE that is not recorded by the

monitors, precluding a comparison of monitor output to total EE.

Indirect calorimetry measured using a portable metabolic analyzer has also been used as a

criterion measure of field-based EE, but use of a metabolic analyzer can result in participant

reactivity and does not allow participants to perform many normal activities. While portable

metabolic analyzers allow for participants to perform activities outside of a laboratory, the

analyzer requires participants to wear a mask (for collecting data on expired gas volumes and

concentrations). Therefore, consumption of food or beverage is prohibited during the course of

wearing the equipment. Additionally, participants must wear a shoulder harness with multiple

pieces of equipment strapped to their backs, making lying or reclining uncomfortable

and unnatural. Another potential criterion measure could be use of whole room indirect

calorimeter chambers, which can measure oxygen consumption (to estimate EE) without

participants needing to wear any equipment. While this setting allows participants to perform

some activities as they would in a free-living setting, being confined to a small room is unnatural

and necessitates the use of exercise machines (e.g., treadmills, stair steppers, cycle ergometers)

to perform many lifestyle and ambulatory activities, making it a poor substitute for true free-

living. Additionally, whole room indirect calorimeter chambers are very expensive and are only

located in a few laboratories around the country, making access to them very difficult.

For the measurement and classification of activity type, DO is commonly used as a

criterion method for measuring free-living activity. DO allows researchers to capture

participants’ actions in the field (Santos-Lozano, Marin et al. 2012), but the act of being

observed is likely to cause reactivity in participants (McKenzie 2002), reducing the

generalizability of findings to a true free-living setting. Additionally, DO would have to be

performed for a period of days or weeks to capture participants’ true activity patterns (Santos-

Lozano, Marin et al. 2012), but this is simply not feasible in a research context and would pose a

significant burden on participants and observers. Finally, it is important to validate measurement

techniques with participants performing a variety of activities. In free-living settings, adults

spend the majority of their time in SB and much less time in household, exercise, or sport

activities. Thus, observing participants for a shorter period of time would likely result in a lack

of variety in activities detected, hindering the ability of the measurement technique to classify

important lifestyle, household, or exercise activities and limiting the utility of the measurement’s

validation to only the population in which it was validated.

Clearly, both laboratory and free-living validation studies are subject to limitations, but a

combination of the two, also called ‘simulated free-living,’ may be an optimal balance of the two

settings. In simulated free-living, researchers can exert control over the types of activities and

the minimum amount of time participants need to perform the activities, but the participants can

choose the amount of time and order in which they perform the activities as well as the technique

they use to perform each activity (e.g., not everyone walks at the same speed or sweeps the same

way). Also, since simulated free-living allows many activities to be performed in a relatively

short period of time, both DO and indirect calorimetry can be utilized for criterion measures of

activity type and EE. Therefore, simulated free-living provides better generalizability to real-

world conditions than strict laboratory-based protocols, but it does not face the limitations of

trying to find an appropriate criterion measure for testing the measurement methods in a true

free-living setting. Importantly, simulated free-living has shown promise in several recent

validation studies of accelerometers (Sun, Schmidt et al. 2008; Rumo, Amft et al. 2011),

providing a strong case for its use in the current study. Once the accelerometers and machine

learning algorithms have been validated in a simulated free-living setting, they can then be used

in true free-living settings with reasonable confidence of their accuracy.

Accelerometer reliability

In order to be used effectively for measurement of PA, EE, and SB, accelerometers must

exhibit high intra- and inter-monitor reliability. Reliability of accelerometers has been assessed

in two main ways: 1) laboratory studies in which accelerometers are placed on mechanical shakers

and 2) free-living studies in which accelerometers are worn either next to each other or on opposite sides of the body

(e.g., left vs. right hip). This section will focus on reliability

studies of the ActiGraph and GENEA accelerometers since these are the two accelerometers

being used in the current study.

In laboratory studies using mechanical shakers, accelerometers generally exhibit very

high intra- and inter-monitor reliability over the range of intensities encountered in most lifestyle

activities (indicated by high intraclass correlations and low coefficients of variation [CVs]). The

ActiGraph accelerometer has been tested extensively for intra- and inter-monitor reliability, with

intra-monitor intraclass correlations ranging from 0.84-0.92, inter-monitor intraclass correlations

from 0.71-0.99, and CVs ranging from 1-9% (Metcalf, Curnow et al. 2002; Brage, Wedderkopp

et al. 2003; Esliger and Tremblay 2006; McClain, Sisson et al. 2007; Santos-Lozano, Marin et al.

2012; Santos-Lozano, Torres-Luque et al. 2012; Troiano and McClain 2012). However, Santos-

Lozano et al. (Santos-Lozano, Marin et al. 2012) found that CVs increased considerably (both

intra- and inter-monitor) at very high and very low intensities when on the shaker. The high CV

at low intensities is not concerning since the high CV is likely being driven by the very low mean

acceleration during low-intensity activities. Similarly, the poor CV achieved during high

intensity shaking is not particularly concerning for the current study since the ActiGraph

placements will be on the hip and mid-thigh. However, the high CV at high intensities may be

problematic for studies with the accelerometer placed on the wrist or ankle, where accelerations

are much more rapid than those experienced at the hip. Importantly, a study by Brage et al.

(Brage, Brage et al. 2003) discovered that raw acceleration data has better inter-monitor

reliability than activity count data, lending further support to the use of raw acceleration data

with machine learning algorithms.
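The observation that a high CV at low intensities is driven by the low mean acceleration can be illustrated with a short calculation. The sketch below uses invented monitor readings (not data from any of the cited studies) and a hypothetical helper, `coefficient_of_variation`, to show that an identical absolute spread between monitors yields a much larger CV when the mean signal is small.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100."""
    return stdev(values) / mean(values) * 100.0


# Hypothetical outputs from five monitors on the same shaker (arbitrary units).
# Both sets have the same absolute spread; only the mean differs.
low_intensity = [0.08, 0.09, 0.10, 0.11, 0.12]
moderate_intensity = [0.98, 0.99, 1.00, 1.01, 1.02]

cv_low = coefficient_of_variation(low_intensity)
cv_moderate = coefficient_of_variation(moderate_intensity)

print(f"CV at low intensity:      {cv_low:.1f}%")
print(f"CV at moderate intensity: {cv_moderate:.1f}%")
```

Because the CV divides by the mean, it inflates as the mean approaches zero even when the monitors disagree by the same absolute amount.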

In free-living settings, intra- and inter-monitor reliability can be assessed by putting

monitors on the same body part but on opposite sides of the body (e.g., left vs. right hip).

McClain et al. (McClain, Sisson et al. 2007) tested the inter-monitor reliability by comparing

outputs from ActiGraph accelerometers mounted on the left vs. right hips and found an intraclass

correlation of 0.99 and CV of 4.9% when measuring MVPA, providing evidence of the real-

world reliability of the ActiGraph. McClain’s work has been supported by several other studies

comparing accelerometers placed on the right and left hips (Brage, Wedderkopp et al. 2003;

Vanhelst, Baquet et al. 2012). Additionally, Welk et al. (Welk 2002) conducted a study in which

participants performed repeated walking and running trials on a treadmill while wearing only one

monitor at a time and found that intra-monitor CVs were very similar to inter-monitor CVs.

They postulated that differences seen in the accelerometer output were likely due to slight

differences in monitor placement rather than the variation in the accelerometers themselves,

providing further evidence that the ActiGraph has good inter- and intra-monitor reliability in the

free-living environment.

Additionally, although the GENEA accelerometer is relatively new, one study by Esliger

et al. (Esliger, Rowlands et al. 2011) has evaluated the reliability of the GENEA in the laboratory

and indirectly in a field-based setting. Using a mechanical shaker, they found an intra-monitor

CV of 1.4% and an inter-monitor CV of 2.1% when assessing 47 monitors at 15 different shaker

speeds. Also, in a free-living setting, the GENEA accelerometers worn on the left and right

wrists both showed excellent validity (r=0.83-0.86) for estimating VO2 (Esliger, Rowlands et al.

2011), providing preliminary evidence of the reliability of the GENEA in both laboratory and

free-living settings.

In summary, reliability studies performed in laboratory and free-living settings indicate

that the ActiGraph and GENEA exhibit good intra- and inter-monitor reliability for measuring

MVPA as well as raw accelerations, supporting their use in the current study.

Identifying non-wear

Determining when accelerometers are being worn vs. when they are removed is very

important for calculating daily PA and SB. Logs or diaries can be used to help determine wear-

time, but these are not ideal for large studies since they are subject to error in recording and

increase participant and researcher burden. When establishing wear-time from accelerometer data,

there are several criteria that must be addressed: non-wear vs. SB, minimum hours/day of wear,

and minimum days/week of wear.

The first difficulty in identifying wear-time is distinguishing between non-wear and SB,

both of which often result in accelerometers registering zero counts/min. Many data reduction

methods have been created in order to identify and remove accelerometer non-wear time by

setting a minimum amount of time with continuous zero counts/min. This minimum time has

been set anywhere from 10 minutes (Riddoch, Bo Andersen et al. 2004) to 90 minutes (Choi, Liu

et al. 2011), and there is no consensus on the optimal duration of continuous zero counts for determining non-wear.
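The consecutive-zero rule described above can be sketched as follows. This is a minimal illustration, not the algorithm of any cited study; published methods may include additional rules that are omitted here. The function name `flag_nonwear`, the example data, and the intermediate 60-minute threshold are assumptions made for illustration; the 10- and 90-minute thresholds are the extremes mentioned in the text.

```python
def flag_nonwear(counts_per_min, min_zero_run=60):
    """Return a list of booleans, True where a minute lies inside a run of
    at least `min_zero_run` consecutive zero-count minutes."""
    flags = [False] * len(counts_per_min)
    run_start = None
    # A non-zero sentinel is appended so that a trailing run of zeros is closed.
    for i, count in enumerate(list(counts_per_min) + [1]):
        if count == 0:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_zero_run:
                for j in range(run_start, i):
                    flags[j] = True
            run_start = None
    return flags


# Hypothetical minute-by-minute counts:
# 5 active minutes, 20 zero minutes, 3 active minutes, 70 zero minutes.
counts = [120] * 5 + [0] * 20 + [80] * 3 + [0] * 70

for threshold in (10, 60, 90):
    nonwear_minutes = sum(flag_nonwear(counts, threshold))
    print(f"threshold = {threshold} min -> {nonwear_minutes} non-wear minutes")
```

With these invented data, the same record yields very different amounts of non-wear depending on the threshold, which is why the lack of consensus matters for estimates of SB.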

Continuous wear and implementation of machine learning algorithms using raw data can

help more effectively deal with non-wear. Conventionally, hip-mounted accelerometers were

worn during waking hours and removed at night and before performing water-based activities

(Welk 2002). Frequent removal of accelerometers is likely to lower compliance when

participants forget to put the monitors back on in the morning or after swimming or showering

(Kinder, Lee et al. 2012), and the data collected with the accelerometers during times of non-

wear do not reflect actual activity levels. Hip-mounted accelerometers have been worn

continuously in several studies (Hjorth, Chaput et al. 2012; Kinder, Lee et al. 2012), but having

an accelerometer protruding from the hip could be uncomfortable for sleeping and lower

participant compliance. Accelerometer placement on the wrist or thigh allows for continuous

wear, providing an advantage of these sites over the hip. As previously mentioned, wrist-worn

accelerometers can be worn continuously and have improved compliance in NHANES data

collection (Troiano and McClain 2012). Additionally, studies using the activPAL accelerometer

(Hart, Ainsworth et al. 2011; Lord, Chastin et al. 2011; Martin, McNeill et al. 2011) indicate that

thigh-mounted accelerometers can be worn continuously with minimal subject discomfort (Craft,

Zderic et al. 2012; Feito, Bassett et al. 2012).

Choice of accelerometer may allow or preclude continuous wear. While the newest

ActiGraph models are said to be waterproof, the GT3X+ and all older models are water resistant

at best, and the company recommends their removal for water-based activities (ActiGraph 2013).

Therefore, use of protective sleeves or barriers is necessary to allow for continuous wear.

Conversely, GENEA accelerometers are waterproof and can be worn 24 hours/day.

Additionally, GENEA accelerometers contain a skin temperature sensor, which can help with the

determination of wear-time and remove the need for using the data reduction techniques

described above for identifying wear-time. Therefore, GENEA accelerometers do not need to be

removed for any reason during data collection, and if they are, the temperature sensor will help

to determine exact wear-time, making them well-suited for use in free-living settings.

Moreover, machine learning algorithms are designed for pattern recognition and, with

proper development and use of raw data, should be able to recognize non-wear as distinct from

SB. The acceleration and monitor orientation signals when an accelerometer is not being worn

are likely very different than when the monitor is being worn during SB because when someone

is engaged in SB, even the smallest movements will be detected by the accelerometer, allowing

differentiation of SB from non-wear. Therefore, when developing machine learning algorithms,

it is important to include non-wear as an activity so that the algorithms can detect non-wear as

distinct from SB.

Additionally, there is a lack of consensus on the minimum amount of time per day a

monitor must be worn in order to yield an accurate reflection of someone’s daily activity

patterns, with minimal wear-time ranging from two hours/day (Brownson, Hoehner et al. 2009)

to 16 hours/day (Slootmaker, Schuit et al. 2009), although most studies require a minimum of 8-

12 hours/day (Masse, Fuemmeler et al. 2005). Finally, the minimal number of days of valid data

needed for an accurate measure of true PA levels has ranged from one day (Le Masurier, Sidman

et al. 2003) to seven days (Matthews, Ainsworth et al. 2002), with most studies using 3-4 as a

minimum number of valid days (Trost, McIver et al. 2005). Choice of minimum number of

continuous zeroes for non-wear, the minimal number of hours/day of accelerometer wear, and

the minimum number of days of wear all can significantly affect the results of subsequent

analyses regarding total PA or SB and activity patterns (Trost, McIver et al. 2005; Evenson and

Terry 2009; Oliver, Badland et al. 2011; Herrmann, Barreira et al. 2012). We expect that a more

accurate way of recognizing non-wear may help to advance discussion on these compliance

issues.
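To make the effect of these compliance criteria concrete, the sketch below applies a minimum hours/day rule and a minimum days rule to one invented week of wear-time. The helper names (`valid_days`, `is_valid_participant`) and the data are hypothetical; the default values of 10 hours/day and 4 days are simply picked from within the commonly used ranges noted above.

```python
def valid_days(wear_minutes_by_day, min_hours_per_day=10):
    """Return the days whose wear-time meets the daily minimum."""
    return [day for day, minutes in wear_minutes_by_day.items()
            if minutes >= min_hours_per_day * 60]


def is_valid_participant(wear_minutes_by_day, min_hours_per_day=10, min_days=4):
    """True if enough days meet the daily wear-time minimum."""
    return len(valid_days(wear_minutes_by_day, min_hours_per_day)) >= min_days


# Hypothetical wear-time (minutes) for one participant over one week.
week = {"Mon": 840, "Tue": 610, "Wed": 450, "Thu": 700,
        "Fri": 590, "Sat": 300, "Sun": 780}

for hours in (8, 10, 12):
    days = valid_days(week, hours)
    print(f"{hours} h/day minimum -> {len(days)} valid days: {days}")
print("Included with 10 h/day and 4 days:", is_valid_participant(week))
```

In this example the participant would be retained under an 8- or 10-hour rule but excluded under a 12-hour rule, which shows how the choice of criteria can change the analytic sample.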

It is important to note that these data reduction rules have been established with the intent

of achieving a certain test-retest reliability (usually r=0.80-0.90) when using linear regression

approaches for measuring PA and/or EE (Welk 2002). With improvements in accelerometer

technology, continuous wear of monitors, and machine learning techniques for data processing

and analysis, these reduction rules may no longer apply. However, this issue lies outside the

scope of the current study.

Summary of current evidence and future directions

In conclusion, there is considerable evidence linking both low PA and high SB to poor

cardiometabolic health. However, without improvements in the measurement of PA and SB

along with accurate determination of activity type, we will be limited in our ability to detect the

true risks of SB or monitor the effectiveness of interventions at reducing SB. Machine learning

techniques show great potential to improve measurement of SB as well as EE and classification

of activity type, but their current complexity may prohibit wide adoption by PA researchers.

The current study aims to develop ANN algorithms for hip-, wrist-, and thigh-mounted

accelerometers using simple-to-compute features from the accelerometer data and freely

available software to allow for relatively simple creation and testing of the ANNs. Our study

will directly compare the accuracy of the hip-, wrist-, and thigh-mounted accelerometers to

measure EE, SB, and activity type in a simulated free-living setting.

CHAPTER 3

VALIDATION AND COMPARISON OF ACCELEROMETERS LOCATED ON THE

WRISTS, HIP, AND THIGH FOR FREE-LIVING ENERGY EXPENDITURE

PREDICTION

ABSTRACT

The purpose of this study was to develop, validate, and compare energy expenditure

prediction models for accelerometers placed on the wrists, hip, and thigh. A secondary purpose

was to achieve high measurement accuracy using simple accelerometer features as input

variables in energy expenditure prediction models. METHODS: Forty-four healthy adults

participated in a 90-minute simulated free-living activity protocol. During the protocol,

participants engaged in a total of 14 different sedentary, ambulatory, lifestyle, and exercise

activities for 3-10 minutes each. Participants chose the order, duration, and intensity of

activities. Four accelerometers were worn (right and left wrists, right hip, and right thigh) in

order to predict energy expenditure compared to that measured by the criterion measure (portable

metabolic analyzer). Artificial neural networks were created to predict energy expenditure from

each accelerometer using a leave-one-out cross-validation approach. Accuracy of the neural

networks was evaluated using Pearson correlations, root mean square error, and bias. Several

models were developed using different input features in order to determine those most relevant

for use in the models. RESULTS: All four accelerometers achieved high measurement

accuracy, with correlations >0.80 for predicting energy expenditure. The thigh accelerometer

provided the highest overall accuracy (r=0.89) and lowest root mean square error (1.05 METs),

and the differences between the thigh and the other monitors were more pronounced when fewer

input variables were used in the predictive models. None of the predictive models had an overall

bias for estimation of energy expenditure. CONCLUSIONS: A single accelerometer placed on

the thigh provided the highest accuracy for energy expenditure prediction, although monitors

worn on the wrists or hip can also be used with high measurement accuracy.

INTRODUCTION

Physical activity (PA) has long been recognized for its beneficial effects on many aspects

of health. Because of these known health benefits, the most recent PA guidelines advocate that

adults obtain a minimum of 150 min/week of moderate-intensity PA, 75 min/week of vigorous-

intensity PA, or a combination of the two (PAGAC 2008). Moderate- and vigorous-intensity PA

can be defined according to the amount of energy they elicit, with moderate-intensity PA being

any activity that elicits an energy expenditure (EE) of at least 3.0 times, but less than 6.0 times,

the resting level (METs) and vigorous-intensity PA as an activity that elicits at least 6.0 METs.

Accurate measurement of EE is vital for understanding prevalence of meeting PA

recommendations, identifying populations who may benefit from interventions aimed at

increasing PA, and better understanding the relationship between PA and health.
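The MET cut-points and weekly targets described above can be expressed in a few lines. The sketch below is illustrative only: the function names are hypothetical, intensities below 3.0 METs are grouped as "below moderate" because this passage does not define finer categories, and the rule that one vigorous minute counts as two moderate minutes in a combination is an assumption inferred from the 150 vs. 75 min/week targets.

```python
def classify_intensity(mets):
    """Classify an activity by its MET value using the cut-points in the text."""
    if mets >= 6.0:
        return "vigorous"
    if mets >= 3.0:
        return "moderate"
    return "below moderate"


def meets_guideline(moderate_min_per_week, vigorous_min_per_week):
    """Check the 150/75 min/week recommendation.

    Assumption: in a combination, each vigorous minute is weighted as two
    moderate minutes, consistent with the 150 vs. 75 min/week targets.
    """
    return moderate_min_per_week + 2 * vigorous_min_per_week >= 150


for value in (1.5, 3.0, 5.9, 6.0):
    print(f"{value} METs -> {classify_intensity(value)}")

# 90 moderate + 30 vigorous minutes = 150 moderate-equivalent minutes.
print("Meets guideline:", meets_guideline(90, 30))
```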

Objective PA measurement tools such as activity monitors have shown considerable

promise due to their relative ease of use and accurate measurement of PA for days or weeks at a

time (Welk 2002). Accelerometer-based activity monitors in particular have seen dramatically

increased use for measurement of free-living PA. Accelerometers are generally worn on the hip

and record accelerations of the trunk as a person moves. These accelerations have traditionally

been used as an independent variable in linear regression equations to estimate EE. Linear

regression approaches to prediction of EE are appealing due to their simplicity and their high

accuracy in initial validation studies, which focused on measuring the EE of ambulatory

activities (i.e., walking and running) in controlled settings (Freedson, Melanson et al. 1998).

However, the linear relationship between accelerations and EE does not seem to hold when

applied to non-ambulatory activities or free-living environments, resulting in much poorer

prediction accuracy in such situations (Hendelman, Miller et al. 2000; Swartz, Strath et al. 2000).

To overcome these limitations, researchers have explored several avenues to improve PA

measurement. One approach involves the use of more than one monitoring device to measure

accelerations and/or other physiologic variables (e.g., heart rate) to improve EE measurement.

Use of multi-monitor systems has shown promise for improving EE measurement in several

studies (Zhang, Pi-Sunyer et al. 2004; Albinali, Intille et al. 2010; Dong, Biswas et al. 2013), but

the use of multiple monitors dramatically increases participant and researcher burden, preventing

these methods from being feasible for use in large surveillance, intervention, or epidemiologic

studies.

Another approach to improving EE prediction has involved using techniques other than

linear regression for modeling the relationship between acceleration data and EE. Machine

learning, a branch of artificial intelligence, has become a popular modeling technique and has

been shown to improve EE measurement in both laboratory-based and free-living settings

(Rothney, Neumann et al. 2007; Staudenmayer, Pober et al. 2009; Freedson, Lyden et al. 2011).

However, there are still many unresolved questions regarding use of machine learning for

predicting EE. First, machine learning modeling may allow for accurate prediction of EE using

accelerometers placed on body locations other than the hip (e.g., wrist, ankle, and thigh), but it is

unclear if accelerometers placed on alternate body locations can achieve the same measurement

accuracy as a hip-mounted accelerometer. The wrist is an appealing accelerometer placement

site due to its utility in measuring sleep and activity type as well as ease of wear (Kripke,

Mullaney et al. 1978; Jean-Louis, Kripke et al. 2001; Zhang, Rowlands et al. 2012; Mannini,

Intille et al. 2013). Additionally, accelerometers worn on the thigh have shown high accuracy

for measuring ambulatory activity and sedentary behavior (Grant, Ryan et al. 2006; Ryan, Grant

et al. 2006). Despite the potential for the wrist and thigh as measurement sites, there is very

limited evidence regarding their utility for measuring EE.

Second, a current limitation of using machine learning to model accelerometer data is that

machine learning models are much more complex than traditional linear regression approaches,

both in the extraction of useful information (features) from accelerometer data to use as inputs

into the models as well as the model creation itself. This complexity currently limits the use of

machine learning and keeps it from being used on a wider scale. However, there is some

evidence that the process of developing and using machine learning can be simplified without

compromising measurement accuracy. In 2009, Staudenmayer et al. (Staudenmayer, Pober et al.

2009) took a large step toward simplifying the use of machine learning modeling. They used the

R statistical software (a freely available, open-source software package) to develop a specific

type of machine learning model (an artificial neural network [ANN]) to predict EE and activity

type. Additionally, they used simple, time-domain features (percentiles of the acceleration signal

and autocorrelation) as input variables and achieved dramatically improved EE estimations over

linear regression approaches. However, it is unknown whether the features they used as input

variables in their models represent an optimal set of input variables for maximizing EE

prediction accuracy.

Third, most validation studies are carried out in laboratory-based settings, which allows

for good control of type, duration, and intensity of activities performed. However, there is

considerable evidence that laboratory-based validation techniques have considerably lower

accuracy when applied to free-living situations (Swartz, Strath et al. 2000; Crouter and Bassett

2008; Lyden, Keadle et al. 2013).

The purpose of this study was fourfold: 1) to validate models for predicting EE from

accelerometers worn on the wrists, hip, and thigh in a simulated free-living

setting, 2) to compare the accuracy of EE prediction for accelerometers located on the wrist, thigh,

and hip, 3) to compare accuracies achieved by the left and right wrists, and 4) to compare different

input features to determine an optimal set of simple input features that maximizes prediction

accuracy while minimizing complexity of the machine learning technique.

METHODS

Summary of protocol

Participants were brought into the Human Energy Research Laboratory to participate in

a 90-minute simulated free-living protocol. For the protocol, participants performed 14

activities for 3-10 minutes each, with order and duration of activities left up to participants.

During the protocol, participants wore a portable metabolic analyzer (for a criterion measure of

EE) and four accelerometers.

Participants

A total of 44 adults (22 male, 22 female) were recruited from the area of East Lansing,

MI via email, flyers, and word of mouth for participation in this study. Exclusion criteria

included the following: 1) if participants had known health conditions that prevented them from

being able to perform MVPA safely, 2) if they were wheelchair-bound or had orthopedic

limitations that invalidated the use of accelerometry for activity measurement, or 3) if they fell

outside the age range of 18-44 years.

Anyone over the age of 44 was excluded from participation as the American College of

Sports Medicine asserts that those aged 45 and above are at higher risk for acute cardiovascular

complications with exercise (ACSM 2009), and we did not have medical personnel available

during testing to approve vigorous PA for older individuals. Anyone under the age of 18 was

excluded from these preliminary validations because children and adolescents have a higher

relative EE for activities than adults due to normal growth and maturation (Krahenbuhl and

Williams 1992), and their activity patterns are different than those of adults (Bailey, Olson et

al. 1995). This study was approved by the Michigan State University Institutional Review

Board prior to participant recruitment. Details of the study were described to each participant

immediately upon arriving at the Human Energy Research Laboratory, and written informed

consent was obtained prior to proceeding with the protocol.

Instrumentation

The instruments used in this study were ActiGraph GT3X+ accelerometers, GENEActiv

accelerometers, and an Oxycon Mobile portable metabolic analyzer. The Oxycon portable

metabolic analyzer provided a criterion measure of EE. The accelerometers and portable

metabolic analyzer were synchronized to an external clock before each test; descriptions of the

accelerometers and metabolic analyzer follow. Pictures of the equipment can be seen in Figure

A.1 in the Appendix.

ActiGraph accelerometers

The ActiGraph (ActiGraph LLC, Pensacola, FL) is the most commonly used accelerometer

on the market for PA research, and there is an abundance of literature regarding its reliability and

validity for measurement of PA (Freedson, Melanson et al. 1998; Matthew 2005). Two GT3X+

models were placed on each participant during the study. One accelerometer was placed on the

midline of the right thigh, one third of the way between the hip and knee and adhered to the leg

with hypoallergenic sticky tape. The other ActiGraph was mounted on the right hip at the anterior

axillary line with an elastic belt. The ActiGraph GT3X+ records raw accelerations of up to ± 6

times gravitational force (6g) in three axes of movement. For the current protocol, the GT3X+

accelerometers recorded at a rate of 40 samples per second (40 Hz).

GENEA accelerometers

The GENEActiv (Activinsights Ltd, Kimbolton, Cambridgeshire, UK) is a new

accelerometer that has recently been validated for PA measurement (Esliger, Rowlands et al.

2011). Like the ActiGraph, the GENEA records raw data of up to ± 6g in three axes of

movement. The GENEAs were set to record acceleration data at a rate of 20 Hz for the current

study. The GENEA is shaped like a watch and comes with a standard wrist strap, allowing for

easy attachment to the wrist. Participants wore two GENEA accelerometers (one on each wrist)

for this study. Each GENEA was fastened securely to the dorsal side of the wrist, between the

styloid processes of the radius and ulna (Esliger, Rowlands et al. 2011).

The acceleration data for all four accelerometers were time stamped and stored within the

monitors and later were downloaded to a computer for analysis. Additionally, the accelerometers

were oriented so that the x-axis was the vertical axis, the y-axis was the medial-lateral axis, and the

z-axis was the anterior-posterior axis.

Oxycon portable metabolic analyzer

The Oxycon Mobile (Cardinal Health, Yorba Linda, CA) portable metabolic analyzer was

used to measure oxygen consumption (VO2) and carbon dioxide production (VCO2) during 13 of

the 14 activities performed in the protocol (EE was recorded but not analyzed for the non-wear

activity). The Oxycon is lightweight (950 g) and was worn on the back using a shoulder harness.

Participants were fitted with a breathing mask (held in place by a mesh cap), which was attached to

a digital turbine flowmeter and gas sampling tube, allowing the analyzer to measure inspired and

expired air volume so that VO2 and VCO2 could be calculated on a breath-by-breath basis. VO2

data were expressed in ml/kg/min and converted to METs (by dividing VO2 by 3.5) for analysis.

Prior to each test, the Oxycon was calibrated according to manufacturer’s specifications to ensure

accurate measurements for flow rate and gas concentration. The Oxycon has been shown to

provide valid VO2 measures over a range of exercise intensities (Rosdahl, Gullstrand et al. 2010;

Akkermans, Sillen et al. 2012) and was used as the criterion measure for EE in this study.
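The conversion from relative VO2 to METs is a simple division by 3.5 ml/kg/min. The sketch below shows that conversion for a handful of invented breath-by-breath readings; the function names and the choice to average the converted values are assumptions for illustration and do not describe how the Oxycon data were actually summarized.

```python
RESTING_VO2_ML_KG_MIN = 3.5  # divisor stated in the text (1 MET)


def vo2_to_mets(vo2_ml_kg_min):
    """Convert relative VO2 (ml/kg/min) to METs."""
    return vo2_ml_kg_min / RESTING_VO2_ML_KG_MIN


def mean_mets(breath_vo2_values):
    """Average MET value over a set of breath-by-breath VO2 readings."""
    mets = [vo2_to_mets(v) for v in breath_vo2_values]
    return sum(mets) / len(mets)


# Hypothetical breath-by-breath VO2 readings in ml/kg/min.
breaths = [10.5, 14.0, 17.5]
average_mets = mean_mets(breaths)
print(f"Mean EE over these breaths: {average_mets:.1f} METs")
```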

Procedure

Each participant reported to the Human Energy Research Laboratory for one visit.

Participants were asked to refrain from eating for three hours prior to visiting the laboratory to

minimize risk of discomfort while performing the activities and because food ingestion can affect

EE values. Details of the study were discussed with each participant. Written informed consent

was obtained, and a physical activity readiness questionnaire was administered to ensure that the

participant was healthy and had no contraindications to engaging in MVPA. If participants had

answered ‘yes’ to any question on the questionnaire, they would have been asked to obtain

physician approval before being able to participate in the study; however, this did not occur. Next,

participant weight and height were taken by trained research assistants according to standardized

methods (Malina 1995). Weight was measured to the nearest 0.1 kg using a Seca digital scale

(Seca, Hanover, Germany), with shoes off and weight balanced on the center of the scale. Height

was measured to the nearest 0.1 cm using a Harpenden stadiometer (Holtain Ltd., Crymych, United

Kingdom). Before measurement, the participant removed his/her shoes, stood erect with feet flat

on the floor, head aligned in the Frankfurt plane, and the back of the feet, shoulders, and head

resting against the back of the board. Two measurements were taken and averaged for both weight

and height. If the two weights differed by more than 0.3 kg or if the two heights differed by more

than 0.4 cm, a third measurement was taken, and the closest two were averaged. Body mass index

(BMI) was calculated by dividing body weight by the square of height (kg/m2). Age was assessed

by asking participants to state their age in years, and handedness was assessed by asking

participants which hand they prefer to use for the majority of activities.
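The duplicate-measurement rule and the BMI calculation described above can be sketched as follows. The function names and the example values are hypothetical; the tolerances (0.3 kg for weight, 0.4 cm for height) are those given in the text.

```python
def averaged_measurement(first, second, tolerance, third=None):
    """Average two measurements; if they differ by more than `tolerance`,
    a third measurement is required and the closest two are averaged."""
    if abs(first - second) <= tolerance:
        return (first + second) / 2
    if third is None:
        raise ValueError("measurements disagree; a third measurement is needed")
    pairs = [(first, second), (first, third), (second, third)]
    a, b = min(pairs, key=lambda pair: abs(pair[0] - pair[1]))
    return (a + b) / 2


def bmi(weight_kg, height_cm):
    """Body mass index in kg/m^2."""
    height_m = height_cm / 100
    return weight_kg / height_m ** 2


# Hypothetical participant: the two weights differ by 0.5 kg (> 0.3 kg),
# so a third weight is taken; the two heights agree within 0.4 cm.
weight = averaged_measurement(70.1, 70.6, tolerance=0.3, third=70.5)
height = averaged_measurement(175.0, 175.2, tolerance=0.4)
bmi_value = bmi(weight, height)
print(f"weight = {weight:.2f} kg, height = {height:.1f} cm, BMI = {bmi_value:.1f}")
```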

Each participant wore the Oxycon metabolic analyzer, one ActiGraph on the hip, another

ActiGraph on the thigh, one GENEA on the left wrist, and one GENEA on the right wrist while

performing 14 activities (activity descriptions provided in Table 3.1, and pictures of the activities

being performed can be found in Appendix D). These activities comprised a range of intensities

from sedentary to vigorous and represented a mixture of sedentary, ambulatory, exercise, and

lifestyle activities. Ambulatory activities (walking, running) are common in accelerometer validation

literature; however, we added the sedentary, exercise, and lifestyle activities to determine the

potential for the four accelerometers to measure a range of activity types and intensities often seen

in free-living settings. Additionally, we added a non-wear activity so that the ANNs would be able

to recognize when the accelerometers were not being worn, allowing for easy exclusion of non-

wear time from data analyses. The non-wear activity was not included in our analysis of EE

prediction.

Table 3.1. Activities performed during the simulated free-living protocol.

Activity Category | Activity | Activity Intensity | Description of Activity*
Sedentary (SE) | Lying down (T1) | Sedentary | Lying on a mat on the floor
Sedentary (SE) | Reading (T2) | Sedentary | Reading a magazine article while sitting at a table
Sedentary (SE) | Computer (T3) | Sedentary | Sitting and playing a computer game that involves mouse clicking and typing
Standing (ST) | Standing (T4) | Light** | Standing still with arms at sides
Lifestyle (LI) | Laundry (T5) | Light | Folding towels and putting them in a laundry basket
Lifestyle (LI) | Sweeping (T6) | Light | Sweeping confetti into piles
Leisure walk (LW) | Walking slow (T7) | Light | Walking at a self-selected ‘slow’ pace in a hallway
Brisk walk (BW) | Walking fast (T8) | Moderate | Walking at a self-selected ‘brisk’ pace in a hallway
Jogging (JO) | Jogging (T9) | Vigorous | Jogging at a self-selected pace in a hallway
Cycling (CY) | Cycling (T10) | Moderate/Vigorous | Cycling on a cycle ergometer at a self-selected cadence of 50-100 rpm with 1 kg resistance
Stair use (SU) | Stair climbing and descending (T11) | Moderate/Vigorous | Walking up and down a flight of stairs at a self-selected pace
Exercise (EX) | Biceps curls (T12) | Light | Standing still while doing biceps curls with a 3-lb. weight in each hand
Exercise (EX) | Squats (T13) | Moderate | With feet shoulder-width apart, bending at the knees (to a 90° angle) while holding an unweighted broom behind the head
Non-wear (NW) | Non-wear of accelerometer (T14) | N/A | Not wearing the accelerometer

* Activity order, intensity, and duration (3-10 minutes) were left up to participants.

** Standing has traditionally been considered SB; however, recent literature suggests that standing

should be considered light-intensity instead of SB due to the differential physiologic effects of

standing as compared to sitting/lying (Owen, Healy et al. 2010).

The 14 activities were performed in a 90-minute, simulated free-living setting which took

place in a laboratory room inside the Human Energy Research Laboratory and a hallway and

stairwell outside the laboratory. A list of the activities was written on a whiteboard for participants

at the beginning of the visit and a description of each activity was given. The order of activities

on the whiteboard was altered every 4-5 participants in order to avoid ordering effects during the

visit. Participants completed each of the 14 activities for a total of at least three minutes and for no

more than 10 minutes, but the order, intensity, and timing of the activities were left up to each

participant. A research assistant observed and recorded each activity on a handheld computer

while it was being performed and periodically updated participants on which activities they still

needed to complete. The non-wear activity was saved until the end of the 90-minute protocol so

that participants would not spend a significant portion of the 90-minute protocol trying to remove

and reattach the accelerometers. Upon completion of the protocol, participants were given a $35

Target® gift card.

Data reduction and modeling

Artificial neural networks

ANNs are nonlinear models which take a set of inputs x1…xk and use them to predict a

certain output variable y (e.g., EE or activity type), where k is the number of features used to

predict y. An ANN designed to predict EE was developed for each accelerometer. Figure 3.1

shows a graphical depiction of the ANN. The general form of an ANN model can be seen in

Equation 1.

Equation 1: $y = \sum_{h=1}^{H} \left[ w_h \, \phi\left( \sum_{j=1}^{k} w_{jh} x_j \right) \right]$

In Equation 1, w are the weights that need to be estimated, $\phi(\cdot)$ is the function applied to each hidden-unit summation (which is a linear function), H is the size of the hidden layer, and y is EE in METs. In accordance with previous research (Preece, Goulermas et al. 2009; Staudenmayer, Pober et al. 2009; Trost, Wong et al. 2012), our ANNs contained only one hidden layer.

Figure 3.1. ANN for predicting EE.

Legend for Figure 3.1

The input layer contains the features used as input variables

*The Hidden Layer contains 15 hidden units, but only three are shown for simplicity.

Accelerometer signal features (one of each per axis, three total of each per accelerometer)

1. Mean = mean
2. Var = variance
3. Cov = covariance
4. Min = minimum
5. Max = maximum
6. MeanOR = mean accelerometer orientation
7. VarOR = variance of accelerometer orientation
8. 10th %ile = 10th percentile
9. 25th %ile = 25th percentile
10. 50th %ile = 50th percentile
11. 75th %ile = 75th percentile
12. 90th %ile = 90th percentile

Participant characteristics features

13. Ht = participant height
14. Wt = participant weight
15. Gender = participant gender

Non-feature abbreviations

S = summations of the input layer in the hidden units

U = activation function for the hidden layer

W1 = the weight vectors for each of the inputs

W2 = the weight vectors for each of the summations
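The layout in Figure 3.1 can be read as a forward pass through a single hidden layer. The sketch below is illustrative only and is not the model fitted in this study: the function name `predict_ee`, the toy weights, and the two example activation functions are assumptions, and bias terms are omitted.

```python
import math


def predict_ee(x, W1, W2, activation):
    """Forward pass of a single-hidden-layer network.

    x          : list of k input features
    W1         : one list of k input weights per hidden unit
    W2         : one output weight per hidden unit
    activation : function U applied to each hidden-unit summation S
    """
    y = 0.0
    for w_in, w_out in zip(W1, W2):
        s = sum(w * xi for w, xi in zip(w_in, x))  # S: weighted sum of the inputs
        y += w_out * activation(s)                 # U(S), scaled by its W2 weight
    return y


def logistic(z):
    """Example nonlinear activation (an assumption for this sketch)."""
    return 1.0 / (1.0 + math.exp(-z))


# Toy example with k = 3 features and H = 2 hidden units (hypothetical weights)
x = [0.5, -1.0, 2.0]
W1 = [[0.1, 0.2, 0.3], [-0.4, 0.0, 0.5]]
W2 = [1.5, 2.0]

ee_identity = predict_ee(x, W1, W2, lambda z: z)
ee_logistic = predict_ee(x, W1, W2, logistic)
```

Passing the activation as an argument keeps the sketch neutral about which function U is used in the hidden layer.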

The ANNs were created and tested using a leave-one-participant-out approach. In this

approach, the ANN was first created from a ‘training’ data set, where the input features and the

outcome variable (EE) were used to estimate the weights for each input feature. This training set

consisted of the data from all but one participant in the study. Then, the ANN was tested on the

data from the participant left out of the training phase. This testing was conducted by supplying

the input features and comparing the predicted EE from the ANNs to the measured EE from the

criterion measure (Oxycon metabolic analyzer). This process was conducted with each

participant’s data used as the testing data once, therefore obtaining an ANN for each participant in

the study. Weights determined from each iteration of the leave-one-participant-out validation were

averaged to obtain a final ANN. This process was conducted separately for each accelerometer,

resulting in four distinct ANNs.
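The leave-one-participant-out procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: a one-feature least-squares line stands in for ANN training, and the participant ids and data values are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept (a stand-in for ANN training)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def leave_one_participant_out(data):
    """data maps participant id -> (feature values, measured EE), one value per window."""
    fits, predictions = {}, {}
    for held_out in data:
        # Training set: every window from every participant except the held-out one
        train_x = [x for pid, (xs, _) in data.items() if pid != held_out for x in xs]
        train_y = [y for pid, (_, ys) in data.items() if pid != held_out for y in ys]
        slope, intercept = fit_line(train_x, train_y)
        fits[held_out] = (slope, intercept)
        # Test on the held-out participant only
        predictions[held_out] = [slope * x + intercept for x in data[held_out][0]]
    # Average the parameters from all iterations to obtain one final model
    n = len(fits)
    final = (sum(s for s, _ in fits.values()) / n,
             sum(b for _, b in fits.values()) / n)
    return final, predictions


# Hypothetical data in which measured EE is exactly twice the feature value
data = {
    "p1": ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),
    "p2": ([1.5, 2.5], [3.0, 5.0]),
    "p3": ([0.5, 4.0], [1.0, 8.0]),
}
final_model, held_out_predictions = leave_one_participant_out(data)
```

Because each participant's windows are kept together, predictions for a participant never come from a model trained on that participant's own data.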

There were three additional considerations that were addressed in building our ANNs: 1)

window length, 2) relevant features to use as input variables, and 3) size of the hidden layer.

Window length

To analyze accelerometer data, the data must first be divided into smaller segments, called

‘epochs’ or ‘windows.’ By dividing the data into windows, EE can be assessed

separately for each window to yield information on activity type, duration, intensity, etc. Windows

of 60 seconds are commonly used for analyzing accelerometer data because outputting a given EE

every minute is intuitively appealing and works well for steady-state activities (Staudenmayer,

Pober et al. 2009; Freedson, Lyden et al. 2011). Additionally, longer windows (i.e., 30-60

seconds) increase the amount of information available with which to determine activity type and

have been shown to improve EE prediction accuracy (Trost, Wong et al. 2012). Finally, early

accelerometers had limited data storage, so acceleration data had to be stored in 60-second

windows in order to be able to record data for a period of several days.

Sixty-second windows work well for laboratory-based protocols where participants

perform activities for a specific amount of time at steady state and then change to the next activity

at known intervals (e.g., every five minutes) (Trost, Wong et al. 2012). However, these long

windows may be less optimal in free-living situations, where steady-state EE is rarely achieved for

physical activities and where activities rarely start or end exactly on the minute (Orendurff, Schoen

et al. 2008; Lyden, Keadle et al. 2013). Other studies have shown similar or better accuracy of

measurement of EE in adults when using shorter epochs (Gabriel, McClain et al. 2010; Ayabe,

Kumahara et al. 2013; Orme, Wijndaele et al. 2014); therefore, we chose to use 30-second

windows in the current study.
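Dividing a signal into 30-second windows can be sketched as below. The sketch assumes the 20 Hz sampling rate used for feature calculation in this study; the function name `split_into_windows` and the decision to drop a trailing partial window are assumptions made for illustration.

```python
SAMPLE_RATE_HZ = 20
WINDOW_SECONDS = 30
SAMPLES_PER_WINDOW = SAMPLE_RATE_HZ * WINDOW_SECONDS  # 600 samples


def split_into_windows(signal, size=SAMPLES_PER_WINDOW):
    """Split a signal into consecutive, non-overlapping windows.

    A trailing partial window is dropped (an assumption of this sketch).
    """
    n_full = len(signal) // size
    return [signal[i * size:(i + 1) * size] for i in range(n_full)]


# One axis of a 90-minute recording, filled with placeholder values
signal = [0.0] * (90 * 60 * SAMPLE_RATE_HZ)
windows = split_into_windows(signal)
```

A 90-minute recording at 20 Hz yields 180 windows of 600 samples each under these settings.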

Features

As mentioned previously, the activity counts variable is commonly used as an input for

linear regression equations used to measure EE and activity intensity. However, contained within

activity counts are useful data ‘features’ that can be extracted and used in either linear regression

models or machine learning algorithms. There are several different types of features that can be

used as input variables. Time-domain features are most commonly used because they can be

directly extracted or computed from accelerometer signal data. Examples of time-domain features

are mean, standard deviation, skewness, or percentiles of the acceleration signal. In addition to

being directly available from accelerometer data, many time-domain features are easy to

understand and interpret (Preece, Goulermas et al. 2009; Staudenmayer, Pober et al. 2009). The

other main type of features, frequency-domain features, can be used either in conjunction with or

independent from time-domain features, yielding similarly high accuracy for activity type

classification as time-domain features in some studies (Preece, Goulermas et al. 2009; Mannini,

Intille et al. 2013). However, frequency-domain features require additional steps such as framing

the data, complex mathematical transformations, and filtering, and their calculation requires

significant computational power (Preece, Goulermas et al. 2009). Additionally, several studies

provide evidence that time-domain features can be used to achieve high activity classification accuracy

(70-90% from a single accelerometer) and EE prediction accuracy without use of frequency-domain

features (Herren, Sparti et al. 1999; Staudenmayer, Pober et al. 2009; Dong, Montoye et al. 2013;

Montoye, Dong et al. 2013). Other than time- and frequency-domain features, simple descriptive

features, such as accelerometer orientation or participant demographic variables, can also be used

and may improve EE measurement accuracy.

Many accelerometer signal features have been used in previous research, and the models

created have varied considerably in complexity and measurement accuracy. While adding more

features may improve accuracy of the ANN, it may also lead to overfitting ANNs to the data used

for training the ANN, resulting in poorer generalizability of the model when applied to a new

population (Preece, Goulermas et al. 2009). Additionally, a major drawback of many machine

learning models is that while they tend to have high measurement accuracy, model complexity can

quickly render them unusable for anyone who lacks considerable knowledge of mathematics or

computer science and/or without access to expensive computing software (Pober, Staudenmayer et

al. 2006; Rothney, Neumann et al. 2007; Staudenmayer, Pober et al. 2009). Thus, this study

focuses on using easy-to-compute features as input variables and identifying a small number of

these features that can achieve high measurement accuracy.

Before computing features, the 40 Hz data from the ActiGraph accelerometers were

reintegrated to 20 Hz for comparison with the data from the GENEA. Table 3.2 provides a list of

calculations for the 39 features tested and used in the analyses. Calculation and extraction of the

accelerometer features were performed in Microsoft Excel. The 36 accelerometer features (12

features for each of three accelerometer axes) are all time-domain features that have been

effectively utilized in previous studies; additionally, weight, height, and gender were included to

account for demographic characteristics of participants. For the EE prediction, the 30-second

windows allow 600 accelerometer signal samples for calculating the features (20 samples/second x

30 seconds). Mean, variance, covariance, minimum, maximum, mean and variance of monitor orientation, and the 10th, 25th, 50th, 75th, and 90th percentiles were calculated separately for x-, y-, and z-axes. These features were chosen to allow the ANN sufficient data to accurately predict EE.

After creating the ANNs using all 39 features, follow-up analyses were conducted to determine an

optimal subset of features that reduced complexity of the ANNs with minimal loss of accuracy. In

all, we used and compared five different sets of features. These feature sets can be seen in Table

3.3. Two feature sets (sets 2 and 5 in Table 3.3) were similar to those used successfully in previous

studies for EE prediction (Staudenmayer, Pober et al. 2009; Dong, Biswas et al. 2013). Feature set

1 was the full set, consisting of many potentially important characteristics of the acceleration signal

that have also been used in other studies, but not necessarily in the same combinations (Preece,

Goulermas et al. 2009). From the 36 accelerometer features used in set 1, correlations were

computed among the features to determine and remove redundancy in information available from

the features (Rothney, Neumann et al. 2007). As with linear regression, highly correlated input

variables can cause collinearity in the ANNs and may reduce their generalizability. Therefore, we

chose two features from each accelerometer axis that were poorly correlated with each other (mean

and variance) and then used a stepwise approach to select features that had correlations of less than

r=0.70 with features already included in the set. Using this approach, we arrived at feature set 3,

consisting of mean, variance, minimum, and maximum of the acceleration signal (for each axis of

measurement) as well as participant weight, height, and gender (15 total features). Finally, in

feature set 4 we initially included lag-one autocorrelation, which has been used in many other

studies since it can yield valuable information on the temporal nature of activities by assessing the

correlation of two adjacent windows of acceleration data. However, the calculation of

autocorrelation involves dividing by the variance of the acceleration data within the adjacent

windows, which for many sedentary activities is 0 and results in an invalid calculation. Other

studies using autocorrelation have used automatic rules for classifying EE of sedentary activities

(i.e., Trost et al. automatically assigned all windows with an invalid lag-1 autocorrelation a MET

value of 1.0) (Trost, Wong et al. 2012), but we feel that this approach is limited since different

sedentary activities may elicit slightly different EE. Instead of using lag-one autocorrelations with

an automatic classification scheme for all sedentary activities, we calculated covariance as a

feature since covariance is simpler to calculate, is defined even when variance is zero, and can

provide information regarding similarity of the accelerometer signals of adjacent data windows

(similar to autocorrelation). These feature sets were tested including and excluding the three

participant characteristics (weight, height, and gender) in order to determine if demographic

characteristics would improve accuracy of the models.
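The correlation-based pruning used to arrive at feature set 3 can be sketched as a simple stepwise filter. This is a hedged illustration, not the procedure as implemented in the study: the function names, the use of the absolute value of r, the handling of constant features, and the toy feature values are all assumptions.

```python
import math


def pearson_r(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # assumption: treat a constant feature as uncorrelated
    return cov / math.sqrt(va * vb)


def select_features(features, seed, candidates, threshold=0.70):
    """Start from the seed features, then add each candidate whose |r| with
    every feature already selected is below the threshold."""
    selected = list(seed)
    for name in candidates:
        if all(abs(pearson_r(features[name], features[s])) < threshold
               for s in selected):
            selected.append(name)
    return selected


# Hypothetical per-window feature values for one axis
features = {
    "mean": [1.0, 2.0, 3.0, 4.0, 5.0],
    "var":  [1.0, -1.0, 0.0, -1.0, 1.0],
    "p50":  [1.1, 2.0, 3.2, 3.9, 5.0],   # nearly identical to the mean
    "min":  [1.0, 0.0, -2.0, 0.0, 1.0],
}
selected = select_features(features, seed=["mean", "var"], candidates=["p50", "min"])
```

In this toy data the 50th percentile is dropped because it is almost perfectly correlated with the mean, while the minimum is retained.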

Table 3.2. Features used for EE prediction.

Feature number | Feature used | Formula for calculating feature in each 30-second window

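1-3* | Mean acceleration signal | $\bar{A}_x = \frac{1}{n}\sum_{i=1}^{n} A_{x,i}$, where n = 600 samples per window

4-6* | Variance of acceleration signal | $\frac{1}{n-1}\sum_{i=1}^{n}\left(A_{x,i} - \bar{A}_x\right)^2$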

7-9* Covariance of acceleration signal ∑ ( ) ( ( ) )]

10-12* | Minimum of acceleration signal | $\min(A_x)$

13-15* | Maximum of acceleration signal | $\max(A_x)$

16-18* | 10th percentile of acceleration signal | For every 600 accelerations, arrange in order from smallest to largest and pick the 60th value

19-21* | 25th percentile of acceleration signal | For every 600 accelerations, arrange in order from smallest to largest and pick the 150th value

22-24* | 50th percentile of acceleration signal | For every 600 accelerations, arrange in order from smallest to largest and pick the 300th value

25-27* | 75th percentile of acceleration signal | For every 600 accelerations, arrange in order from smallest to largest and pick the 450th value

28-30* | 90th percentile of acceleration signal | For every 600 accelerations, arrange in order from smallest to largest and pick the 540th value

N/A Accelerometer orientation

(needed for calculating features

31-36) ( )

(

√(

)

)

31-33* Mean accelerometer orientation (

)

34-36* Variance of accelerometer

orientation ∑ ( )

37 Participant height N/A

38 Participant weight N/A

39 Participant gender N/A

The * signifies that one feature is included for each of the three accelerometer axes. The formulas

shown are for the x-axis, but the formulas for the y- and z-axes are similar. Ax is the acceleration in

the direction of the x-axis.
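Several of the features in Table 3.2 can be computed for one axis of one window as sketched below. The function names are hypothetical, the use of the sample variance (dividing by n - 1) is an assumption, and the covariance and orientation features are omitted from this sketch. The percentile rule follows the table: sort the 600 values and pick the 60th, 150th, 300th, 450th, or 540th.

```python
def percentile_by_rank(sorted_values, pct):
    """Pick the value at 1-based rank n * pct / 100 from an ascending list,
    e.g. the 60th of 600 values for the 10th percentile."""
    rank = (len(sorted_values) * pct) // 100
    return sorted_values[max(rank, 1) - 1]


def window_features(ax):
    """Time-domain features for one axis of one window."""
    n = len(ax)
    mean = sum(ax) / n
    variance = sum((a - mean) ** 2 for a in ax) / (n - 1)  # assumption: sample variance
    ordered = sorted(ax)
    feats = {"mean": mean, "var": variance, "min": ordered[0], "max": ordered[-1]}
    for pct in (10, 25, 50, 75, 90):
        feats[f"p{pct}"] = percentile_by_rank(ordered, pct)
    return feats


# Synthetic 600-sample window holding the values 1, 2, ..., 600
window = [float(i) for i in range(1, 601)]
features = window_features(window)
```

With this synthetic window, each percentile equals its rank, which makes the picking rule easy to check by eye.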

Table 3.3. Feature sets used for creation and testing of ANNs.

Feature set number | Features used | Total number of features used

1 | Mean, variance, covariance, minimum, maximum, mean orientation, variance of orientation, and 10th, 25th, 50th, 75th, and 90th percentiles of acceleration signal, weight, height, and gender | 39 (12 accelerometer features per axis * 3 axes + weight + height + gender)

2 | Mean and variance of acceleration signal, weight, height, and gender | 9 (2 accelerometer features per axis * 3 axes + weight + height + gender)

3 | Mean, variance, minimum, and maximum of acceleration signal, weight, height, and gender | 15 (4 accelerometer features per axis * 3 axes + weight + height + gender)

4 | Mean, variance, covariance, minimum, and maximum of acceleration signal, weight, height, and gender | 18 (5 accelerometer features per axis * 3 axes + weight + height + gender)

5 | 10th, 25th, 50th, 75th, and 90th percentiles of acceleration signal, weight, height, and gender | 18 (5 accelerometer features per axis * 3 axes + weight + height + gender)

Size of the hidden layer

As with the number of features used, more hidden units in the hidden layer allow for more

flexibility in the ANNs, allowing the model to better fit the training data. However, having more

units also increases the chances of overfitting. There is no consensus on the optimal number of

hidden units to use, but some investigators have used a number of hidden units similar to the

number of activities being identified and/or the number of input features used (Preece, Goulermas

et al. 2009; De Vries, Garre et al. 2011). Since our aim is to minimize the number of features used

and since our study contains 14 activities, we chose to use 15 hidden units in our hidden layer.

Oxycon data

In a previous study by members of our research group, we reintegrated breath-by-breath

Oxycon portable metabolic analyzer data into 10- and then 15-second windows for analysis.

However, with both windows we found that data loss occurred in participants with slower

breathing rates (especially during sedentary activities), resulting in our reintegrating the data into

30-second windows for our final analysis (Montoye, Dong et al. 2014). Correspondingly, breath-

by-breath Oxycon data from the simulated free-living protocol were reintegrated into 30-second

windows for measurement of EE in the current study. These 30-second windows of accelerometer

data were used for training the ANNs to predict EE (as described earlier). Also, when testing the

EE ANNs, 30-second windows were used for computing predicted EE for comparison to Oxycon-

measured EE. Since the Oxycon recorded continuously and was not dependent on correctly

identifying an activity type, all data, including transitions, were included for training and testing of

the ANNs.
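The data loss seen with short windows can be illustrated with a small reintegration sketch. The function name `reintegrate`, the unweighted averaging of breaths within a window, and the breath times and values are all hypothetical; the sketch only shows why a slow breathing rate can leave a 10-second window empty while 30-second windows remain populated.

```python
def reintegrate(breaths, window_s, total_s):
    """Average breath-by-breath values within consecutive windows.

    breaths : list of (time in seconds, value) pairs
    Returns one mean per window, or None where no breath fell in the window.
    """
    n_windows = total_s // window_s
    buckets = [[] for _ in range(n_windows)]
    for t, v in breaths:
        idx = int(t // window_s)
        if 0 <= idx < n_windows:
            buckets[idx].append(v)
    return [sum(b) / len(b) if b else None for b in buckets]


# Hypothetical slow breathing: one breath every 12 seconds for one minute
breaths = [(6.0, 1.0), (18.0, 1.2), (30.0, 1.1), (42.0, 0.9), (54.0, 1.0)]
ten_s = reintegrate(breaths, 10, 60)      # the 20-30 s window receives no breath
thirty_s = reintegrate(breaths, 30, 60)   # both windows receive breaths
```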

Statistical analyses

After downloading the accelerometer and Oxycon data, all data processing was conducted

in Microsoft Excel (Microsoft Corporation, Redmond, WA), and ANN creation was performed

using the R statistical software package (R-project, Vienna, Austria). We chose to use Microsoft

Excel for data processing and R for our ANN creation in accordance with our intent to create and

use ANNs using simple methodology which can be used by those without extensive computer

programming skills or access to expensive computing software. Microsoft Excel is a very

commonly used and widely accessible software package for personal computing, and R is

relatively simple to use and is manageable to learn and use for researchers who may have limited

statistical or computing experience. Additionally, R is open-source software that is freely

available for download and has a special ANN library that can be used for development and

testing of ANNs. Thus, development and application of ANNs in R is less costly and much less

complicated than machine learning algorithms developed in previous studies (Pober,

Staudenmayer et al. 2006; Rothney, Neumann et al. 2007; Preece, Goulermas et al. 2009; Mannini

and Sabatini 2010), and R has been used successfully for creation of ANNs for predicting EE and

activity type (Staudenmayer, Pober et al. 2009; Lyden, Keadle et al. 2013).

Three summary statistics were calculated in order to test the accuracy of each ANN for

predicting EE: Pearson correlations, root mean square error, and bias. Operational definitions of

these three measures are given below.

Pearson correlations (r): The covariance of two variables is divided by the product of the standard

deviations of the two variables to obtain r. The range of possible r values is -1 to 1, with 1 being a

perfect correlation and -1 being a perfect inverse correlation (Field 2009). A minimum correlation

of r=0.60 has been defined as moderately high validity in the literature; therefore, we desire to

obtain a correlation of r≥0.60 between predicted EE and Oxycon-measured EE (Safrit and Wood

1995). Had this minimum correlation not been met, we would have increased the window length and

added features to reach our desired correlation.

Root mean square error (RMSE): The square root of the mean squared difference between values

predicted by an estimator (the ANNs) and the true values (measured by the criterion measure) is

the RMSE. Smaller RMSE values represent better prediction of the ANNs; thus, our goal was to

minimize RMSE to maximize accuracy of the ANNs.

Bias: Bias is the difference between the estimated value of a measure and the true value. Bias

allows for determinations of systematic over- or underprediction of EE; a negative bias represents

underestimation of EE by the ANNs, and a positive bias represents overestimation. We desired to

achieve bias values close to 0 in order to maximize accuracy of the ANNs.
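The three summary statistics defined above can be sketched as follows. The function names and example values are hypothetical, and computing bias as the mean of predicted minus measured EE across windows is an assumption consistent with the sign convention described (positive values indicate overestimation).

```python
import math


def pearson_r(pred, meas):
    """Covariance divided by the product of the standard deviations."""
    n = len(pred)
    mp, mm = sum(pred) / n, sum(meas) / n
    cov = sum((p - mp) * (m - mm) for p, m in zip(pred, meas))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    sm = math.sqrt(sum((m - mm) ** 2 for m in meas))
    return cov / (sp * sm)


def rmse(pred, meas):
    """Square root of the mean squared difference."""
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(pred, meas)) / len(pred))


def bias(pred, meas):
    """Mean of predicted minus measured; positive means overestimation."""
    return sum(p - m for p, m in zip(pred, meas)) / len(pred)


# Hypothetical EE values in METs: every prediction is 0.5 METs too high
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]

r_value = pearson_r(predicted, measured)
rmse_value = rmse(predicted, measured)
bias_value = bias(predicted, measured)
```

The example shows why all three statistics are reported together: a constant offset leaves the correlation perfect while still producing nonzero RMSE and bias.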

Correlations, RMSE, and biases were calculated separately for each of the four

accelerometers and each of the five different feature sets. Differences among correlations, RMSE,

and biases among the four accelerometers were assessed using repeated-measures analysis of

variance (RMANOVA). Additionally, differences among feature sets were evaluated using

RMANOVA. Since correlations tend to be negatively skewed, we first performed a Fisher’s Z

transformation to normalize the correlations before performing the RMANOVA. When the

RMANOVA revealed statistically significant differences for any of the three analyses, post hoc

dependent t-tests were conducted to determine differences among monitor placements or feature

sets. The a priori alpha level was set at P<0.05 for determining statistical significance. Statistical

analyses were performed using SPSS version 22 (IBM Corporation, Armonk, NY).
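Fisher's Z transformation is the inverse hyperbolic tangent of r, and it can be sketched as below. The function names are hypothetical, and the four example correlations are the feature set 1 values reported in Table 3.6, used here only to illustrate the calculation rather than to reproduce the per-participant analysis.

```python
import math


def fisher_z(r):
    """Fisher's Z transformation: z = 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)


def inverse_fisher_z(z):
    """Back-transform a z value to the correlation scale."""
    return math.tanh(z)


# Illustrative correlations (feature set 1 values from Table 3.6)
correlations = [0.88, 0.89, 0.86, 0.86]
z_values = [fisher_z(r) for r in correlations]

# Averaging on the z scale, then back-transforming to r
mean_r = inverse_fisher_z(sum(z_values) / len(z_values))
```

Statistical tests are run on the z values, and results can be back-transformed with `inverse_fisher_z` for reporting.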

Power analysis

In the simulated free-living setting, correlations of r≥0.60 between measured EE and

estimated EE from the four ANNs were desired to indicate moderately high validity of the

accelerometers for EE estimation (Safrit and Wood 1995). Table 3.4 shows the minimum

correlation that could have been detected with different sample sizes and power. For example,

with 20 participants, power is 80% to detect a correlation of 0.591. Thus, our sample size of 44

was well above the minimum required number of 25 needed to detect our minimum desired

correlation (0.60) with greater than 90% power. We chose to oversample in order to ensure

adequate sample size due to the potential for occasional malfunction and/or loss of battery power

of the accelerometers or the portable metabolic analyzer experienced in a previous study by

members of our research group (Montoye, Dong et al. 2014).

Table 3.4. Minimum Pearson correlations detectable for a given sample size and power.

Sample size | 80% power | 90% power

18 | 0.619 | 0.684

20 | 0.591 | 0.656

24 | 0.545 | 0.609

30 | 0.492 | 0.554

36 | 0.452 | 0.511

42 | 0.395 | 0.477
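A minimum detectable correlation for a given sample size and power can be approximated with the Fisher z method, as sketched below. This is an assumed calculation, not necessarily the one used to build Table 3.4: it takes a two-sided alpha of 0.05 and a standard error of 1/sqrt(n - 3), and the function name `min_detectable_r` is hypothetical. Under these assumptions it reproduces, for example, the tabulated entries for n = 20 at 80% power and n = 24 at 90% power.

```python
import math
from statistics import NormalDist


def min_detectable_r(n, power, alpha=0.05):
    """Smallest correlation detectable with the given power (Fisher z approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    return math.tanh((z_alpha + z_power) / math.sqrt(n - 3))


r_20_80 = min_detectable_r(20, 0.80)
r_24_90 = min_detectable_r(24, 0.90)
```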

RESULTS

Malfunction of the Oxycon metabolic analyzer (due to a bad battery) occurred in three

participants, and accelerometer malfunction occurred in another two participants. These

participants were excluded from further analyses, resulting in 39 participants included in model creation and

validation. Means and standard deviations (SD) for participant characteristics (both those included

in and excluded from analysis) are shown in Table 3.5. Although weight and BMI appeared higher

in females excluded in the final analysis, these differences were not statistically significant. Of the

39 participants included in the analysis, 13 were either overweight or obese according to BMI (≥25

kg/m²). Additionally, four of the 39 participants included in the final analysis were left-hand

dominant, with the remaining 35 being right-hand dominant.

Table 3.5. Demographic characteristics of participants enrolled in study.

Mean (SD) | Included in analysis: All (n=39) | Included in analysis: Males (n=19) | Included in analysis: Females (n=20) | Excluded from analysis: All (n=5) | Excluded from analysis: Males (n=3) | Excluded from analysis: Females (n=2)

Age (years) | 22.1 (4.3) | 23.7 (5.0) | 20.5 (2.7) | 21.2 (2.9) | 21.3 (4.0) | 21.0 (1.4)

Weight (kg) | 72.4 (16.2) | 84.5 (13.1) | 60.8 (8.9) | 78.2 (21.2) | 75.9 (4.9) | 81.6 (41.4)

Height (cm) | 171.4 (10.1) | 179.1 (7.7) | 164.1 (5.7) | 167.8 (9.8) | 175.9 (0.4) | 157.1 (2.4)

BMI (kg/m²) | 24.4 (3.6) | 26.3 (3.4) | 22.5 (2.6) | 28.0 (9.1) | 24.8 (1.7) | 32.8 (15.8)

In initial testing of the five feature sets, it was found that the addition of weight, height, and

gender yielded no gains in predictive accuracy of the ANNs. Therefore, these features were

removed when training and testing the ANNs. Correlations for predicted EE are shown in Table

3.6. With correlations ranging from r=0.82-0.89 for the four accelerometers across the five sets of

features, all four monitors achieved correlations well above the r=0.60 desired to indicate

moderately high validity. The RMANOVA test among accelerometer placement sites revealed a

test statistic of F=4.36, indicating significant differences among the four placement sites. Post-hoc

tests revealed that the ActiGraph thigh accelerometer had higher correlations with measured EE

(r=0.88-0.89) than the wrist accelerometers for all five feature sets (r=0.82-0.86) and higher

correlations than the hip accelerometer (r=0.83-0.88) for all sets other than set 1 (which included

all 39 features). Correlations achieved by the left and right wrist accelerometers were similar for

each of the five feature sets.

When comparing accuracy achieved among the five sets of features, the thigh monitor

accuracy was not affected by choice of feature set. Conversely, for the hip accelerometers, feature

sets 2-5 resulted in slightly lower correlations; similarly, correlations dropped for the wrist

accelerometers for feature sets 2-4 but not for set 5. Despite the statistical significance of the

decreased correlations seen with the hip and both wrist accelerometers, the actual drop in

correlations was quite small, especially for feature sets 3-5.

Table 3.6. Correlations of measured vs. predicted EE.

Correlations (SD) | ActiGraph Hip | ActiGraph Thigh | GENEA Left Wrist | GENEA Right Wrist

Set 1 (All accelerometer features) | 0.88 (0.05) | 0.89 (0.07) | 0.86 (0.05)* | 0.86 (0.06)*

Set 2 (Mean, Var) | 0.83 (0.06)^ | 0.88 (0.09)& | 0.82 (0.06)*^ | 0.82 (0.08)*^

Set 3 (Mean, Var, Min, Max) | 0.86 (0.04)*^ | 0.89 (0.08)& | 0.84 (0.05)*^ | 0.83 (0.06)*^

Set 4 (Mean, Var, Cov, Min, Max) | 0.86 (0.06)*^ | 0.89 (0.10)& | 0.84 (0.06)*^ | 0.85 (0.06)*^

Set 5 (10th, 25th, 50th, 75th, 90th percentiles) | 0.87 (0.04)*^ | 0.89 (0.05) | 0.86 (0.05)* | 0.86 (0.06)*

The * indicates significant differences from thigh accelerometer placement site.

The & indicates significant differences from hip accelerometer placement site.

The ^ indicates significant difference from feature set 1 (all features).

Root mean square error (RMSE) values for predicted vs. measured EE for the four

accelerometers are shown in Figure 3.2. The RMANOVA test revealed a test statistic of F=3.64,

indicating significant differences in RMSE among placement sites. For all five feature sets, the

thigh accelerometer placement (1.05-1.14 METs) had significantly lower RMSE values than the

hip (1.12-1.42 METs), left wrist (1.18-1.36 METs), and right wrist (1.18-1.38 METs)

accelerometer placements. Moreover, when comparing among the five feature sets, the RMSE for

the thigh was not significantly different for any of the five. Conversely, RMSE values with the hip

accelerometer placement were significantly higher with feature sets 2-5 than set 1. Similarly, with

the two wrist accelerometer placement sites, RMSE was significantly higher with feature sets 2-4

than set 1, although feature set 5 yielded similar RMSE to set 1. There were no differences in

RMSE values between left and right wrists.

Figure 3.2. RMSE values for predicted vs. measured EE.

The * indicates significant differences from other accelerometers.

The ^ indicates significant difference from feature set 1 (all features).

For interpretation of the references to color in this and all other figures, the reader is referred to the

electronic version of this dissertation.

[Figure 3.2: plot of RMSE (METs), with a vertical axis from 0 to 1.9, by Feature Set (1-5) for the ActiGraph Hip, ActiGraph Thigh, GENEA L. Wrist, and GENEA R. Wrist accelerometers.]

Average biases for each accelerometer can be seen in Table 3.7. The RMANOVA test

statistic was F=0.062, indicating no overall bias for any of the four monitor placements or for any

of the five feature sets. This lack of bias indicates that none of the accelerometers had an overall

overestimation or underestimation of EE in the total sample.

Table 3.7. Bias for measured vs. predicted EE.

Bias (SD) | ActiGraph Hip | ActiGraph Thigh | GENEA Left Wrist | GENEA Right Wrist

Feature Set 1 | 0.01 (0.35) | -0.01 (0.34) | -0.02 (0.32) | -0.01 (0.35)

Feature Set 2 | 0.02 (0.59) | 0.03 (0.42) | 0.01 (0.41) | -0.03 (0.49)

Feature Set 3 | -0.03 (0.46) | 0.01 (0.32) | -0.03 (0.35) | 0.00 (0.47)

Feature Set 4 | 0.05 (0.44) | 0.01 (0.35) | 0.00 (0.48) | -0.01 (0.48)

Feature Set 5 | 0.05 (0.43) | -0.05 (0.29) | 0.01 (0.46) | 0.05 (0.47)

DISCUSSION

The purposes of this study were 1) to validate accelerometers worn on the wrists and thigh

for prediction of EE, 2) to compare the accuracy of EE prediction for accelerometers located on

the wrists, thigh, and hip, 3) to compare accuracies of the left and right wrists, and 4) to use

simple input features to maximize prediction accuracy while minimizing complexity of the

machine learning technique.

Our results showed strong correlations between measured and predicted EE for all

accelerometer placements and for all five feature sets. Also, our results indicated no systematic

bias by any of the accelerometer placements for estimating EE. Overall, the thigh-mounted

accelerometer provided the highest correlations with measured EE and also the lowest RMSE of

the placement sites. With the full feature set (set 1), the thigh- and hip-mounted accelerometers

provided similar EE prediction accuracy, but the thigh performed better when subsets of the full

feature set were tested. Additionally, the thigh-mounted accelerometer performance was not

diminished for any of the five feature sets tested, meaning that even very simple inputs such as

mean and variance of the acceleration signal can be used to predict EE with a high degree of

accuracy. Given previous work showing high accuracy for measuring sedentary behavior and

ambulatory activities with thigh-mounted accelerometers (Grant, Ryan et al. 2006; Ryan, Grant

et al. 2006; Skotte, Korshoj et al. 2012), the results of this study further illustrate the utility of the

thigh as a highly accurate placement site for activity and EE measurement.

Despite the superiority of the thigh-mounted accelerometer, it is worth emphasizing that

the two wrist-mounted accelerometers provided only slightly lower accuracy than the thigh and

comparable accuracy to the hip, resulting in high overall prediction accuracy of all four monitors.

Our finding of high prediction accuracy for the wrist accelerometer placement sites lies in

contrast to studies that have used linear regression-based approaches for estimating EE. In the

early days of activity monitors, Montoye et al. found significantly higher correlations for

predicting EE using a hip-mounted motion sensor (r=0.71) compared to a wrist-mounted

accelerometer (r=0.40) during ambulatory and exercise activities (Montoye, Washburn et al.

1983). Similarly, Swartz et al. found that in a simulated free-living setting, the hip-mounted

accelerometer estimated EE with a moderate correlation of r=0.56, while the wrist-mounted

accelerometer had a very poor correlation (r=0.18) with measured EE (Swartz, Strath et al.

2000). Finally, as recently as 2013, Rosenberger et al. found higher correlations (r=0.72 vs.

r=0.36) and lower error (0.55 vs. 0.85 METs) when predicting EE from a hip-mounted

accelerometer compared to a wrist-mounted accelerometer (Rosenberger, Haskell et al. 2013). It

is important to note that these studies all used linear regression for their modeling technique; the

consistent superiority of the hip to the wrist when linear regression is used is not surprising given

that hip monitors record movement of the trunk, while wrist monitors record arm movement that

may or may not be coupled with movement of the rest of the body, resulting in poor correlations

of activity counts and EE.
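
To make the count-based regression approach concrete, the following Python sketch fits a least-squares line from activity counts to measured METs and computes the Pearson correlation for a hip and a wrist monitor. The function names and all data values are hypothetical and were invented for illustration only; they are not taken from the studies cited above.

```python
from math import sqrt
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def pearson_r(x, y):
    """Pearson product-moment correlation between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical per-minute values (assumption, for illustration only).
measured_mets = [1.2, 1.5, 3.0, 3.4, 6.1, 6.8]
hip_counts = [50, 120, 2100, 2600, 5200, 6000]
wrist_counts = [900, 4000, 1500, 3800, 2500, 5200]

for name, counts in (("hip", hip_counts), ("wrist", wrist_counts)):
    slope, intercept = fit_line(counts, measured_mets)
    r = pearson_r(counts, measured_mets)
    print(f"{name}: METs = {slope:.5f} * counts + {intercept:.2f}, r = {r:.2f}")
```

In this invented example the wrist counts track EE less closely than the hip counts, which mirrors the pattern reported for linear regression models but should not be read as a result.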

A significant advantage of machine learning is its ability to recognize patterns in an

acceleration signal rather than simply using magnitude of acceleration for prediction. Recent

studies by Mannini et al. (2013) and Zhang et al. (2012) show very high activity classification

accuracies (85-97%) using a wrist accelerometer coupled with machine learning models, giving

strong reason to believe that machine learning would also allow for high accuracy for EE

prediction (Zhang, Rowlands et al. 2012; Mannini, Intille et al. 2013). The results of the current

study support the utility of machine learning modeling as a viable approach to analyzing wrist-

mounted accelerometer data and provide further evidence of the superiority of machine learning

to linear regression for modeling of accelerometer data. Additionally, while the current

convention is for wrist accelerometers to be worn on the non-dominant wrist, the results of this

study suggest that choice of wrist does not affect the accuracy of EE estimation. A 2012 study by

Zhang et al. found that classification accuracy for identifying four types of activities was 97%

and 96% for left and right wrist accelerometers, respectively, further supporting the idea that

choice of wrist placement will not affect measurement accuracy.

The high accuracy of wrist-mounted accelerometers for EE prediction found in this study

is especially encouraging given the utility of the wrist location for measuring sleep quality as

well as its current use in large surveillance studies such as NHANES (Troiano and McClain

2012; Troiano, McClain et al. 2014). Additionally, wrist-mounted accelerometers are

comfortable to wear and can be designed/disguised to look like watches, both of which may lead

to improved compliance. With the ability to accurately measure sleep as well as activity type

and EE, the wrist may represent an ideal blend of practicality and measurement accuracy for

monitoring lifestyle behaviors and patterns. Of note, the left and right wrist accelerometer

placements achieved equally high accuracies for prediction of EE, which provides evidence that

the popular convention for an accelerometer to be placed on the non-dominant wrist may be

unnecessary.

The hip-mounted accelerometer achieved correlations of r=0.83-0.88 and RMSE values

of 1.12-1.42 METs with the different feature sets, and these statistics compare favorably to those

achieved in previous studies. In a study conducted in a laboratory-based setting, Staudenmayer

et al. found that an ANN developed using data from a hip-mounted accelerometer predicted EE

with an RMSE of 1.22 METs. This RMSE represented an improvement of 32-71% over

previously developed linear regression approaches tested in the study (Staudenmayer, Pober et

al. 2009). Additionally, their input features were very similar to our feature set 5 (the 10th, 25th,

50th, 75th, and 90th percentiles of the acceleration signal), lending additional support that this

feature set is viable for use in different settings and populations. Similarly, Lyden et al. achieved

intraclass correlation coefficients above 0.90 and RMSE values of 1.00 METs for predicting EE

using ANNs developed from hip-mounted accelerometer data in a true free-living setting, again

achieving superior accuracy when compared to linear regression approaches (Lyden, Keadle et

al. 2013). In another study, Rothney et al. achieved a correlation of r=0.92 and RMSE of 0.50

METs when predicting EE using an ANN developed from a hip-mounted accelerometer in a

simulated free-living setting. Their slightly better accuracy is likely due to study design,

especially given that their use of a linear regression approach to EE prediction yielded a

correlation of r=0.89 and an RMSE of 1.00 METs, both of which are considerably better than

accuracy achieved in other studies (Hendelman, Miller et al. 2000; Swartz, Strath et al. 2000;

Staudenmayer, Pober et al. 2009). Despite the slightly higher RMSE values achieved by the

ANNs in our study, our results are encouraging given that participants averaged an intensity of

3.3 METs across the duration of the protocol, which is higher than in many other studies and likely

contributes to higher RMSE, as seen in previous work by our research group (Montoye, Dong et

al. 2014). Taken together, these studies reinforce the high accuracy for EE prediction achievable

using machine learning techniques on data from a single, hip-mounted accelerometer, both in

laboratory-based and free-living settings.
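
The RMSE values and percentage improvements discussed above can be computed as in the following Python sketch. It assumes that "improvement" means the relative reduction in RMSE with respect to a baseline model; the cited studies may define it differently. The function names and the example values are hypothetical.

```python
from math import sqrt

def rmse(predicted, measured):
    """Root mean squared error between predicted and measured EE (METs)."""
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return sqrt(sum(squared) / len(squared))

def percent_improvement(baseline_rmse, new_rmse):
    """Relative reduction in RMSE, in percent, of a new model over a baseline."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse

# Hypothetical minute-by-minute values (assumption, for illustration only).
measured = [1.0, 2.5, 4.0, 6.0]
ann_pred = [1.2, 2.4, 4.5, 5.5]
reg_pred = [2.0, 2.0, 3.0, 4.5]

ann_rmse = rmse(ann_pred, measured)
reg_rmse = rmse(reg_pred, measured)
print(ann_rmse, reg_rmse, percent_improvement(reg_rmse, ann_rmse))
```

Because RMSE is expressed in METs, it tends to grow with the average intensity of the protocol, which is one reason comparisons across studies should be made cautiously.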

Our final objective in the current study was to use relatively simple methods for feature

extraction and ANN creation and compare sets of input features in order to identify relevant

feature sets that allow for high measurement accuracy while minimizing the complexity of the

ANN, both in its structure and in its creation. In order to achieve the first part of this objective,

all data cleaning and feature extraction were conducted in Microsoft Excel. Features were

calculated and extracted using simple functions already built into Excel. While this method is

somewhat labor intensive, the key strength of this approach is that it is a viable method for

feature extraction without knowledge of or access to powerful, complicated software packages.

Use of macros in Excel requires additional knowledge of the software package but can also

streamline the process of feature extraction. Additionally, ANN creation was conducted using R

statistical software, which is a freely available, open-source software package. Writing programs

in R is complex and requires skill, but implementing programs which have already been written

is relatively simple and can be accomplished with knowledge of only a few commands in R. Use

of the nnet package for creating ANNs has been successfully accomplished by Staudenmayer et

al. and Lyden et al., and considerable detail of the approach, including some of the code for

creating and testing the ANNs, can be found in their manuscripts (Staudenmayer, Pober et al.

2009; Lyden, Keadle et al. 2013).
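
For readers who do not use R, the general idea of a single-hidden-layer network that maps accelerometer features to EE can be sketched in plain Python. This is an illustrative sketch only and is not the code used in this study or in the cited manuscripts: it uses tanh hidden units and plain batch gradient descent, which is not equivalent to the fitting routine in the nnet package, and the function names, hyperparameters, and toy data are all assumptions.

```python
import math
import random

def hidden_activations(W1, b1, x):
    """tanh activations of the hidden layer for one feature vector x."""
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(W1, b1)]

def predict(model, x):
    """Predicted EE (METs) for one feature vector; the output unit is linear."""
    W1, b1, w2, b2 = model
    h = hidden_activations(W1, b1, x)
    return sum(w * hj for w, hj in zip(w2, h)) + b2

def mean_squared_error(model, X, y):
    return sum((predict(model, xi) - yi) ** 2 for xi, yi in zip(X, y)) / len(y)

def train_ann(X, y, n_hidden=3, lr=0.05, epochs=500, seed=0):
    """Fit the network by batch gradient descent on half the mean squared error."""
    rng = random.Random(seed)
    n, n_in = len(X), len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    for _ in range(epochs):
        # Accumulate gradients over all samples before updating any weight.
        gW1 = [[0.0] * n_in for _ in range(n_hidden)]
        gb1 = [0.0] * n_hidden
        gw2 = [0.0] * n_hidden
        gb2 = 0.0
        for xi, yi in zip(X, y):
            h = hidden_activations(W1, b1, xi)
            err = sum(w * hj for w, hj in zip(w2, h)) + b2 - yi
            gb2 += err
            for j in range(n_hidden):
                gw2[j] += err * h[j]
                back = err * w2[j] * (1.0 - h[j] ** 2)  # tanh derivative
                gb1[j] += back
                for i in range(n_in):
                    gW1[j][i] += back * xi[i]
        b2 -= lr * gb2 / n
        for j in range(n_hidden):
            w2[j] -= lr * gw2[j] / n
            b1[j] -= lr * gb1[j] / n
            for i in range(n_in):
                W1[j][i] -= lr * gW1[j][i] / n
    return W1, b1, w2, b2

# Hypothetical training data: [mean, variance] of a window -> measured METs.
X = [[0.0, 0.0], [0.2, 0.1], [0.5, 0.4], [1.0, 0.9]]
y = [1.0, 2.0, 4.0, 8.0]

untrained = train_ann(X, y, epochs=0)   # same seed, so same starting weights
trained = train_ann(X, y, epochs=500)
print(mean_squared_error(untrained, X, y), mean_squared_error(trained, X, y))
```

A real application would hold out participants for validation rather than evaluating on the training data, as is done here purely to show that fitting reduces the error.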

To address the second part of our objective to simplify use of ANNs, we sought to define

an optimal subset of features that can be used without sacrificing measurement accuracy. For the

thigh accelerometer, we found that choice of features had minimal impact on measurement

accuracy, even in the simplest feature set (set 2) consisting of only mean and variance of the

acceleration signal. A very similar set of features was used in a study by members of our

research group, in which they were able to classify 14 activities with an accuracy above 78%

with a thigh accelerometer (Dong, Montoye et al. 2013). Therefore, this minimal feature set

appears to provide strong accuracy for both activity type classification and EE prediction when

using a thigh-mounted accelerometer. For the hip-mounted accelerometer, feature sets 2-5 all

provided slightly lower prediction accuracy than set 1, although the drop in accuracy was very

small (especially for sets 3-5) and may be of little practical significance. Finally, with the two

wrist-mounted accelerometers, feature sets 2-4 resulted in significantly higher RMSE values and

lower correlations with measured EE compared to the other two feature sets, although these

differences were also small. Additionally, feature set 5 provided similar measurement accuracy

to set 1. The choice of features to use in a predictive model will be dependent on the emphasis

on accuracy vs. the feasibility for use. For studies with an emphasis on accuracy of

measurement, the larger feature sets used with the wrist and hip placements yielded better

accuracies than the ANNs developed from the smaller feature sets. On the other hand, the

simplest ANNs developed for the thigh placement were able to predict EE with similar accuracy

to the largest feature set. Also, the simplest models (i.e., feature set 2, which included only mean

and variance of the acceleration signals) can be used with high accuracy and RMSE within 25%

of that achieved with the largest feature set (set 1) for the wrist and hip accelerometer

placements. Therefore, these smaller feature sets may be more appropriate for use in large-scale

studies, where ease of use of the predictive models is of utmost importance.
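
The simple features discussed above (mean and variance, and percentiles of the acceleration signal) can be computed per window with very little code. The sketch below shows one way to do so in Python rather than in a spreadsheet. The window length, the use of sample variance, the percentile interpolation rule, the single-axis input, and the function names are assumptions made for illustration and may differ from the calculations performed in this study.

```python
from statistics import mean, variance

def percentile(values, pct):
    """Percentile by linear interpolation between the two nearest ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * pct / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def split_windows(signal, rate_hz, window_s):
    """Cut a signal into consecutive, non-overlapping windows of window_s seconds."""
    size = int(rate_hz * window_s)
    if size < 2:
        raise ValueError("each window needs at least two samples")
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]

def mean_variance_features(window):
    """Smallest feature set: mean and (sample) variance of the window."""
    return [mean(window), variance(window)]

def percentile_features(window):
    """Percentile-based features: 10th, 25th, 50th, 75th, and 90th percentiles."""
    return [percentile(window, p) for p in (10, 25, 50, 75, 90)]

# Hypothetical single-axis signal sampled at 4 Hz (assumption, for illustration).
signal = [0.0, 0.1, 0.3, 0.2, 0.0, -0.1, 0.4, 0.5]
for window in split_windows(signal, rate_hz=4, window_s=1):
    print(mean_variance_features(window), percentile_features(window))
```

Each returned list would form one row of inputs to a predictive model, with one row per window and per accelerometer axis if several axes are used.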

Taken together, the findings of this study support the use of simple-to-compute

acceleration features for achieving highly accurate estimates of free-living EE using machine

learning. Moreover, choice of the number and type of features appears to alter EE prediction

accuracy slightly, but the practical significance of these small differences is likely minimal,

indicating that researchers may be able to use ANNs with only a few, simple-to-compute

accelerometer features and achieve high measurement accuracy.

Study limitations and strengths

There were several limitations of the current study. First, study participants represented a

fairly homogeneous group of college-age adults. Thus, our findings are not necessarily

generalizable to older populations or children/adolescents and require further validation before

use in these populations. Second, the use of a simulated free-living setting rather than a true

free-living setting could be viewed as a limitation since some studies have used a true free-living

setting for ANN creation and validation (Lyden, Keadle et al. 2013). Third, we did not measure

resting VO2, which is known to vary across individuals (Ferro-Luzzi 1968). However, like

creation of individual HR curves for improving the accuracy of EE prediction using HR, taking

individual resting EE into account results in dramatically increased burden on researchers and

participants; more importantly, individual resting EE measurement would limit the generalizability

of our findings since it is not often possible to measure resting EE in intervention or epidemiologic

studies, where accelerometers are often used. Instead of measuring resting VO2, it may be useful

to include variables such as age and fat-free mass into prediction models since these variables

account for the majority of variation in resting VO2 (Johnstone, Murison et al. 2005). However,

our study did not find that the inclusion of demographic variables such as weight, height, and

gender improved EE prediction when added as input features. Last, we experienced some

difficulties with keeping thigh-mounted accelerometers in their proper location during the

protocol. Taping monitors on the thigh worked well initially but was less reliable once

participants started to sweat. We attempted to secure the monitor using an elastic strap, but this

often slipped throughout the session and was less comfortable to participants. There have been

several studies that have successfully used thigh-mounted accelerometers for PA and SB

measurement, and in future work we hope to communicate with other researchers regarding

optimal strategies for mounting accelerometers on the thigh due to their high measurement

accuracy and ability to be worn inconspicuously (i.e., under clothing) to enhance compliance.

There are also several notable strengths of this study. First and foremost, we believe the

simulated free-living setting represents the best blend of exerting some control over participant

activities while still allowing considerable freedom for the order, intensity, and duration of

activities chosen by participants. Troiano et al. identified that PA tends to be performed in short

bouts, meaning that steady-state is rarely achieved during PA in free-living settings (Troiano,

Berrigan et al. 2008). This finding provides a rationale for including transitions and non-steady-state

activities in our protocol, since such a protocol is more similar to true free-living settings than a

typical laboratory-based validation.

A true free-living setting may theoretically have the most real-world generalizability, but

a major issue in true free-living settings is lack of a good criterion measure. Doubly labeled

water provides an accurate estimate of total EE but cannot measure activity EE or minute-to-

minute EE. Also, Lyden et al. used a true free-living setting for their ANN creation and

validation and direct observation as their criterion measure. Trained observers recorded

activities being performed and later used activity classification to predict EE using the

Compendium of Physical Activities (Ainsworth, Haskell et al. 2011). While this approach

probably represents the best possible criterion in a true free-living setting, it is limited in that the

Compendium is an estimate of activity EE and is not suitable for individual EE prediction. Also,

without imposing some structure in which participants must perform certain activities for a

minimum time, it is likely that participants will spend the majority of their time in activities such

as sitting and walking and minimal or no time performing other activities, limiting the

generalizability of ANNs created from these data. By utilizing a variety of activities across a

wide range of intensities and including all transition data during the visit in our analysis, we

incorporated many advantages of a true free-living setting while also exerting enough control to

ensure that a variety of activities were performed. Additionally, in the simulated free-living

setting we were able to use a portable metabolic analyzer as our criterion measure, which is

widely used as a criterion measure for EE measurement. Another strength of the study was the

use of Microsoft Excel and R statistical software for all stages of data cleaning, feature

computation and extraction, and ANN creation and validation. These software programs are

widely available, and they can be used to create and test machine learning algorithms with

minimal experience in computational programming. Finally, it can sometimes be difficult to

compare results across studies due to differences in protocol, number and types of activities

performed, population used, and modeling approach(es) tested. By simultaneously using four

accelerometers, our study allows for direct comparisons of monitors worn on different places on

the body for accuracy in EE prediction.

Conclusions

In summary, our study provides strong preliminary evidence that machine learning

modeling allows for single accelerometers mounted on the thigh and wrists to provide highly

accurate estimates of EE in a simulated free-living setting. Thigh-mounted accelerometers

appear to perform with slightly better accuracy than hip- or wrist-mounted accelerometers,

although this difference is fairly small. Also, we have shown that choice of wrist (dominant vs.

non-dominant) does not affect accuracy of EE prediction. Finally, our study builds off the work

of others and highlights ways of reducing complexity of ANN model creation, hopefully

allowing for this approach to be used by a wider group of researchers with skills in areas other

than activity measurement. In future studies we plan to extend our comparison of different

placement sites for accuracy of activity classification as well as measurement of SB and sleep

across different populations. Also, we plan to experiment with using data from multiple

monitors to further improve measurement accuracy over that achieved with a single monitor.

Finally, we intend to cross-validate the algorithms developed in the study in a true free-living

setting to provide support for their future use for EE prediction in epidemiologic or surveillance

research.

CHAPTER 4

COMPARISON OF ACTIVITY TYPE CLASSIFICATION ACCURACY FROM

ACCELEROMETERS WORN ON THE WRISTS, HIP AND THIGH

ABSTRACT

The purpose of this study was to develop, validate, and compare the accuracy of activity type

prediction models for accelerometers placed on the wrist, hip, and thigh. Additionally, we

compared classification of activity type between accelerometers worn on the left and right wrists.

Finally, we compared prediction accuracies for specific categories of activities (e.g., sedentary

activities). METHODS: Forty-four healthy adults participated in a 90-minute simulated free-living

activity protocol, in which participants performed a total of 14 activities (sedentary,

ambulatory, lifestyle, and exercise activities, standing, cycling, stairs, and non-wear) for 3-10

minutes each. The order, duration, and intensity of activities were dictated by participants and

recorded using direct observation (for a criterion measure of activity type). Four accelerometers

were worn (right and left wrists, right hip, and right thigh) in order to predict activity type using

artificial neural networks. The artificial neural networks were created using several sets of input

features in order to determine those most relevant to activity type prediction. Classification

accuracy of the artificial neural networks was evaluated using sensitivity, specificity, and area

under the curve, with direct observation used as the criterion measure of activity type.

RESULTS: The wrist accelerometers achieved the highest overall classification accuracies for

identifying all 14 activities (80.9-81.1%) as well as when similar activities were grouped into

categories (86.6-86.7%). Additionally, classification accuracies were similar between left and

right wrists. The hip accelerometer had the lowest overall classification accuracies (66.2-

72.5%), with the thigh accelerometer accuracy higher than the hip but lower than the wrists

(71.4-84.0%). Sedentary, lifestyle, and exercise activities were detected best with the wrist

accelerometers, whereas the ambulatory activities had similar classification accuracies with all

four accelerometer placements. Unlike our previous work with energy expenditure prediction

(Chapter 3), more input features significantly improved classification accuracy.

CONCLUSIONS: A single accelerometer placed on the left or right wrist provided the highest

overall classification accuracy for activity type prediction as well as the highest accuracy for

sedentary, lifestyle, and exercise activity categories in a simulated free-living setting.
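
The sensitivity and specificity used to evaluate the networks can be computed for each activity in a one-versus-rest fashion, with direct observation supplying the true labels. The following Python sketch illustrates that calculation along with overall accuracy; area under the curve is not covered. The function names and the activity labels are hypothetical and chosen for illustration.

```python
def sensitivity_specificity(true_labels, predicted_labels, target):
    """One-versus-rest sensitivity and specificity for a single activity."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == target:
            if pred == target:
                tp += 1
            else:
                fn += 1
        elif pred == target:
            fp += 1
        else:
            tn += 1
    # Return NaN when a rate is undefined (no positives or no negatives).
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity

def overall_accuracy(true_labels, predicted_labels):
    """Fraction of windows whose predicted activity matches the criterion."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Hypothetical window-by-window labels (assumption, for illustration only).
observed = ["sit", "sit", "walk", "walk", "run"]
predicted = ["sit", "walk", "walk", "walk", "run"]

print(overall_accuracy(observed, predicted))
for activity in ("sit", "walk", "run"):
    print(activity, sensitivity_specificity(observed, predicted, activity))
```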

INTRODUCTION

Objective measurement of physical activity (PA) and sedentary behavior (SB) is

important for determining relationships between these lifestyle behaviors and health indices,

identifying populations at high risk of having low PA and high SB levels, and evaluating the

effectiveness of interventions designed to increase PA and/or decrease SB. Because of the

interest (at both a population and personal level) in measuring PA and SB, many wearable

devices, such as heart rate monitors, pedometers, and accelerometers, have been used in an

attempt to quantify PA and SB. Accelerometers have emerged as the most popular method due

to their relatively low participant and researcher burden as well as high accuracy for measuring

physiologic variables such as energy expenditure and activity intensity (Welk 2002). Traditional

use of accelerometers has involved linear regression for predicting energy expenditure from

“activity counts”, which are pre-processed and filtered acceleration signals from an

accelerometer (Freedson, Melanson et al. 1998). However, in recent years the field has started to

move away from the count-based regression approach because linear regression is often

inadequate to capture the complex relationship of acceleration patterns and movement that

occurs in free-living settings (Hendelman, Miller et al. 2000; Swartz, Strath et al. 2000) and

cannot allow for determination of the type of activity being performed (Preece, Goulermas et al.

2009).

A large body of recent research has focused on machine learning, a pattern recognition

approach for modeling data, in order to predict energy expenditure as well as activity type using

features extracted from accelerometer data (Preece, Goulermas et al. 2009). Using machine

learning, researchers have achieved activity classification accuracies consistently over 70%

(Staudenmayer, Pober et al. 2009; Dong, Montoye et al. 2013) and often over 90% (Zhang,

Rowlands et al. 2012; Cleland, Kikhia et al. 2013; Mannini, Intille et al. 2013; Skotte, Korshoj et

al. 2014) using data from a single accelerometer worn on different parts of the body. Although

accelerometers have traditionally been worn on the hip, machine learning has yielded high

activity classification accuracy from accelerometers worn on the wrist, thigh, and ankle (Cleland,

Kikhia et al. 2013; Dong, Montoye et al. 2013; Mannini, Intille et al. 2013; Skotte, Korshoj et al.

2014). Of the many placement sites tested in different studies, the wrist and thigh hold

significant promise in the context of machine learning approaches to data analysis.

Wrist-mounted accelerometers have been used successfully for sleep measurement (Jean-

Louis, Kripke et al. 2001) and are also being used in the 2011-2014 cycle of NHANES data

collection in the hope of improving compliance (Troiano and McClain 2012). Moreover, several

studies have achieved high accuracy for activity type classification with wrist accelerometers

(Zhang, Rowlands et al. 2012; Cleland, Kikhia et al. 2013; Mannini, Intille et al. 2013), further

demonstrating the utility of the wrist as a promising measurement site. The convention is for

wrist-mounted accelerometers to be worn on the non-dominant wrist, but it may be that this

convention is unnecessary. A study by Zhang et al. found similar classification accuracies from

accelerometers worn on the left and right wrists for four types of activities (sedentary, household,

walking, and running). However, it is unknown if other kinds of activities, especially activities

that may vary considerably between dominant and non-dominant hands (e.g., sweeping,

computer use, etc.), are detected with similar accuracy for each wrist. If choice of wrist

placement does not affect measurement accuracy, there may be important implications for

improving compliance and comfort with wrist-worn accelerometers.

The thigh possesses significant potential as an accelerometer placement site due to

its high accuracy for measuring SB (Grant, Ryan et al. 2006; Kozey-Keadle, Libertine et al.

2011; Lyden, Kozey Keadle et al. 2012) and high accuracy for prediction of energy expenditure

(Chapter 3). Studies have found that activity type classification accuracies from a thigh-mounted

accelerometer are similar to or higher than accuracies achieved with a wrist-mounted

accelerometer (Cleland, Kikhia et al. 2013; Dong, Biswas et al. 2013; Skotte, Korshoj et al.

2014), but it is unknown if the thigh can provide high measurement accuracies across a wide

variety of activities.

Despite the utility of the hip, wrist, and thigh as accelerometer placement sites, few

studies have directly compared activity classification accuracies among these sites to determine

their overall accuracy as well as classification of different types of activities such as sedentary,

ambulatory, or lifestyle activities. One study by Cleland et al. found classification accuracies

above 95% for accelerometers located on the hip, wrist, and thigh, but the small number of

activities performed and small number of participants limit our understanding of the advantages

and disadvantages of each placement site. Other studies by Dong et al. (Dong, Montoye et al.

2013) and Skotte et al. (Skotte, Korshoj et al. 2014) compared two of these three placement sites

but did not compare all three. Therefore, further research is needed to directly compare

classification accuracies of hip-, wrist-, and thigh-mounted accelerometers.

Another research gap is the lack of machine learning algorithm validation for activity

type classification in free-living settings. Most previous studies have been conducted in

laboratory-based settings with participants performing a set list of activities in a pre-specified

order, for a pre-specified period of time, and at a constant, pre-specified intensity (Cleland,

Kikhia et al. 2013; Dong, Montoye et al. 2013; Mannini, Intille et al. 2013; Skotte, Korshoj et al.

2014). These laboratory-based settings ensure that high control is exerted over the protocol and

can provide valuable insight as to the strengths and weaknesses of predictive algorithms for

classifying different types and intensities of activities. However, the lack of variation allowed in

laboratory-based protocols makes the laboratory setting very different from a free-living

environment, where individuals are not constrained to a certain order, intensity, or timing of

activities. Previous work with cut-points as well as machine learning provides evidence that

predictive techniques validated in the laboratory perform with much lower accuracy when

applied to a free-living setting (Swartz, Strath et al. 2000; Gyllensten and Bonomi 2011; Lyden,

Keadle et al. 2013). In one such study, Lyden et al. (Lyden, Keadle et al. 2013) found that an

ANN created from laboratory-based activity data had a bias of over 33% when used to predict

energy expenditure from data collected in a free-living setting. Similarly, a study by Gyllensten

et al. showed that activity type machine learning algorithms developed in a laboratory had

classification accuracies 15-20% lower in a free-living setting than in the laboratory-based

setting in which they were created (Gyllensten and Bonomi 2011). Therefore, activity type

prediction algorithms need to be created and validated in a free-living setting in order to have

true utility for activity measurement in epidemiologic, surveillance, or intervention studies.

Given the current gaps that exist with regard to activity type classification, the purposes

of our study were 1) to develop and validate ANNs (using several sets of features) for prediction

of activity type from accelerometers worn on the wrists, hip, and thigh, 2) to compare the activity

classification accuracies achieved among these accelerometer placement sites, 3) to compare the

overall activity classification accuracies of accelerometers placed on the left and right wrists, and

4) to compare classification accuracies for specific activity types, activity categories (i.e.,

lifestyle, exercise, sedentary, and ambulatory activities), and activity intensities (i.e., sedentary,

light, etc.) using data collected in a simulated free-living setting.

METHODS

Summary of protocol

Participants came to the Human Energy Research Laboratory to participate in a 90-

minute simulated free-living protocol, for which they performed a total of 14 sedentary,

ambulatory, lifestyle, and exercise activities. Each activity was performed for 3-10

minutes, with the order, duration, and intensity of activities left up to participants. During the

protocol, participants wore four accelerometers, and the order and durations of their activities

were recorded by a trained observer and used as a criterion measure of activity type.

Participants

A total of 44 adults (22 male, 22 female) were recruited from the area surrounding East

Lansing, MI via email, flyers, and word of mouth for participation in this study. Participants

had to fulfill three criteria to be eligible for the study: 1) they had to be free of health

conditions preventing them from being able to safely perform moderate- or vigorous-intensity

activities, 2) they could not have any orthopedic limitations that would invalidate the use of

accelerometry for activity measurement, and 3) they had to fall within the age range of 18-44

years. Prior to participant recruitment, this study was approved by the Michigan State

University Institutional Review Board. All participants provided written informed consent

prior to their participation in the study.

Instrumentation

The activity monitors used in this study were ActiGraph GT3X+ accelerometers and

GENEActiv accelerometers. Additionally, an iPAQ portable digital assistant (PDA) computer was

used by observers to record the activities performed during the protocol. The acceleration data for

all four accelerometers were time stamped and stored within the monitors and later were

downloaded to a computer for analysis. Additionally, the accelerometers were oriented so that the

x-axis was the vertical axis, the y-axis was the medial-lateral axis, and the z-axis was the anterior-

posterior axis. All accelerometers and the PDA were synchronized to an external clock prior to the

start of data collection. Descriptions of the accelerometers and PDA follow.

ActiGraph accelerometers

The ActiGraph (ActiGraph LLC, Pensacola, FL) is a commonly used, commercially

available accelerometer, and there is an abundance of literature regarding its reliability and validity

for measurement of PA (Freedson, Melanson et al. 1998; Matthew 2005; McClain, Sisson et al.

2007). Two GT3X+ models were worn by each participant during the study. One accelerometer

was placed on the midline of the right thigh, one third of the way between the hip and knee and

adhered to the leg with hypoallergenic sticky tape. The other ActiGraph was mounted on the right

hip, at the anterior axillary line, with an elastic belt. The ActiGraph GT3X+ records raw

accelerations of up to ± 6 times the gravitational force (6g) in three axes of movement. For the

current protocol, the accelerometers were set to record data at a rate of 40 samples per second (40

Hz).

GENEA accelerometers

The GENEActiv (Activinsights Ltd, Kimbolton, Cambridgeshire, UK) is a new

accelerometer that has had preliminary validation for PA measurement (Esliger, Rowlands et al.

2011) as well as activity type classification (Zhang, Rowlands et al. 2012). Like the ActiGraph,

the GENEA records raw data of up to ± 6g in three axes of movement. The GENEAs were set to

record acceleration data at a rate of 20 Hz for the current study. Participants wore two GENEA

accelerometers (one on each wrist) for this study. Each GENEA was fastened securely to the

dorsal side of the wrist, between the styloid processes of the radius and ulna (Esliger, Rowlands et

al. 2011).

iPAQ portable digital assistant and direct observation

Direct observation (DO) was conducted using an HP iPAQ PDA (HP Development

Company, Palo Alto, CA) to obtain a criterion measure of activity type for this study. During the

protocol, a trained observer used a portable digital assistant with BEST software developed based

on the Children’s Activity Rating Scale protocol (Puhl, Greaves et al. 1990). The number codes

T1-T14 represented the 14 activities in the visit, and the observer recorded the activities being

performed continuously as they occurred throughout the visit. A list of activities and their specific

DO codes can be found in Table 4.1. Inter-rater reliability for DO was above r=0.90 for this study.

Procedure

Each participant reported to the Human Energy Research Laboratory, where details of the

study were discussed with each participant. Written informed consent was obtained, and a

physical activity readiness questionnaire was administered to ensure that the participant was

healthy and had no contraindications to engaging in activity. If participants had answered ‘yes’ to

any question on the questionnaire, they would have been required to obtain physician approval

before being able to participate in the study; however, this did not occur. After consenting to

participation, participant weight and height were taken by trained research assistants according to

standardized methods (Malina 1995). Weight was measured to the nearest 0.1 kg using a Seca

digital scale (Seca, Hanover, Germany), with shoes off and weight balanced on the center of the

scale. Height was measured to the nearest 0.1cm using a Harpenden stadiometer (Holtain Ltd.,

Crymych, United Kingdom). For measurement of height, the participant removed his/her shoes,

stood erect with feet flat on the floor, aligned head in the Frankfurt plane, and placed the back of

the feet, shoulders, and head against the back of the board. Two measurements were taken and

averaged for both weight and height. If the two body weights differed by more than 0.3 kg or if

the two heights differed by more than 0.4 cm, a third measurement was taken, and the closest two

measurements were averaged to obtain a final value. Body mass index (BMI) was calculated by

dividing body weight by the square of height (kg/m²). Age was assessed by asking participants to

state their age in years. Handedness was determined by asking participants which hand they prefer

to use for the majority of activities.
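
The duplicate-measurement rule and the BMI calculation described above can be sketched in a few lines. This is a minimal illustration, not the procedure used by the research assistants; the function names and the example readings are hypothetical.

```python
from itertools import combinations


def final_measurement(readings, tolerance):
    """Average two readings, or the closest two of three if the first pair disagrees."""
    first, second = readings[0], readings[1]
    if abs(first - second) <= tolerance:
        return (first + second) / 2
    if len(readings) < 3:
        raise ValueError("readings differ by more than the tolerance; a third is needed")
    # Pick the pair of readings with the smallest absolute difference.
    a, b = min(combinations(readings[:3], 2), key=lambda pair: abs(pair[0] - pair[1]))
    return (a + b) / 2


def body_mass_index(weight_kg, height_cm):
    """BMI in kg/m^2 from weight in kilograms and height in centimeters."""
    height_m = height_cm / 100
    return weight_kg / height_m ** 2


# Hypothetical readings, for illustration only.
weight = final_measurement([72.1, 72.6, 72.5], tolerance=0.3)  # third reading is used
height = final_measurement([171.2, 171.4], tolerance=0.4)      # first two readings agree
bmi = body_mass_index(weight, height)
```

Here the first two weights differ by 0.5 kg, so the closest pair (72.6 and 72.5 kg) is averaged, whereas the two heights agree within 0.4 cm and are averaged directly.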

Each participant wore one ActiGraph on the hip, another ActiGraph on the thigh, one

GENEA on the left wrist, and one GENEA on the right wrist while performing 14 activities

(activities shown in Table 4.1). These activities comprised a range of intensities from sedentary to

vigorous and represented a mixture of sedentary, ambulatory, exercise, and lifestyle. Ambulatory

activities (walking and jogging) are common in accelerometer validation literature; we added the

sedentary, exercise, and lifestyle activities to determine the potential for the four accelerometer

placements to accurately measure a range of activity types often seen in free-living settings.

Additionally, we added an activity where participants removed the accelerometers so that the

ANNs would be able to recognize non-wear, which is important to be able to detect in free-living

environments for compliance purposes.

Table 4.1. Activities performed during the simulated free-living protocol.

Activity Category | Activity | Activity Intensity | Description of Activity*
Sedentary (SE) | Lying down (T1) | Sedentary | Lying on a mat on the floor
Sedentary (SE) | Reading (T2) | Sedentary | Reading a magazine article while sitting at a table
Sedentary (SE) | Computer (T3) | Sedentary | Sitting and playing a computer game that involves mouse clicking and typing
Standing (ST) | Standing (T4) | Light** | Standing still with arms at sides
Lifestyle (LI) | Laundry (T5) | Light | Folding towels and putting them in a laundry basket
Lifestyle (LI) | Sweeping (T6) | Light | Sweeping confetti into piles
Leisure walk (LW) | Walking slow (T7) | Light | Walking at a self-selected ‘slow’ pace in a hallway
Brisk walk (BW) | Walking fast (T8) | Moderate | Walking at a self-selected ‘brisk’ pace in a hallway
Jogging (JO) | Jogging (T9) | Vigorous | Jogging at a self-selected pace in a hallway
Cycling (CY) | Cycling (T10) | Moderate/Vigorous | Cycling on a cycle ergometer at a self-selected cadence of 50-100 rpm with 1 kg resistance
Stair use (SU) | Stair climbing and descending (T11) | Moderate/Vigorous | Walking up and down a flight of stairs at a self-selected pace
Exercise (EX) | Biceps curls (T12) | Light | Standing still while doing biceps curls with a 3-lb. weight in each hand
Exercise (EX) | Squats (T13) | Moderate | With feet shoulder-width apart, bending at the knees (to a 90° angle) while holding an unweighted broom behind the head
Non-wear (NW) | Non-wear of accelerometer (T14) | N/A | Not wearing the accelerometer

* Activity order, intensity, and duration (3-10 minutes) were left up to participants.

** Standing has traditionally been considered SB; however, recent literature suggests that standing should be considered light-intensity instead of SB due to the differential physiologic effects of standing as compared to sitting/lying (Owen, Healy et al. 2010).

Participants completed a 90-minute simulated free-living protocol, which took place in a

laboratory within the Human Energy Research Laboratory as well as in a hallway and stairwell.

During the protocol, participants performed the 14 activities listed in Table 4.1. A list of these

activities was given to participants at the beginning of the visit along with a description of how to

perform each activity. Participants completed each of the 14 activities for a total of at least three

minutes and for no more than 10 minutes, but the order, intensity, and duration of the activities

were left up to each participant. A research assistant directly observed and recorded each activity

on a handheld PDA computer while it was being performed. Additionally, activities were written

on a whiteboard and checked off as participants completed each activity so that participants knew

which activities they still needed to complete. Every 4-5 participants, the activities were erased

and rewritten in a different order to avoid possible effects from the order in which the activities

were written. For this study, DO served as the criterion measure of activity type performed. The

non-wear activity was saved until the end of the 90-minute protocol so that participants would not

spend a significant amount of time trying to remove and reattach the accelerometers. Upon

completion of the protocol, participants were given a $35 Target® gift card.

Data reduction and modeling

Artificial neural networks

ANNs are nonlinear models which take a set of inputs x1…xk and use them to predict a

certain output variable y (e.g., EE or activity type), where k is the number of features used to

predict y. Figure 4.1 provides a graphical depiction of one of the activity type ANNs. For activity

type classification, the ANNs functioned similarly to a logistic regression model. Setting the activity

types as the nominal values a1…a14, the ANN model can be seen in Equation 1.

Equation 1: $\Pr(y = a_j) = C \exp\left( \sum_{h=1}^{H} w_{hj} \, U\left( \sum_{i=1}^{k} w_{ih} x_i \right) \right)$

In Equation 1, Pr is probability, C is a constant chosen so that Pr(y=a1)+…+Pr(y=a14)=1, w are

the weights of the input features, U is a logistic activation function, and H is the number of hidden

units. In accordance with previous research, our models contained only one hidden layer (Preece,

Goulermas et al. 2009; Staudenmayer, Pober et al. 2009; Trost, Wong et al. 2012).
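
To make the structure of Equation 1 concrete, the following sketch computes activity probabilities for a network with one hidden layer, a logistic activation U, and a normalizing constant C chosen so that the probabilities sum to 1. It is an illustration under stated assumptions rather than the fitted model: the weights are made-up toy values, bias terms are omitted, and the function and variable names are hypothetical.

```python
import math


def logistic(z):
    """Logistic activation U(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def activity_probabilities(x, hidden_weights, output_weights):
    """Forward pass of a single-hidden-layer network with normalized outputs.

    x: list of k input features.
    hidden_weights: one row of k weights per hidden unit (inputs -> hidden units).
    output_weights: one row of hidden-unit weights per activity (hidden -> outputs).
    Bias terms are omitted for brevity (an assumption of this sketch).
    """
    hidden = [logistic(sum(w * xi for w, xi in zip(row, x))) for row in hidden_weights]
    scores = [sum(w * h for w, h in zip(row, hidden)) for row in output_weights]
    top = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)                       # dividing by total plays the role of C
    return [e / total for e in exps]


# Toy example: k = 2 features, 3 hidden units, 4 activity types.
x = [0.5, -1.0]
hidden_weights = [[0.2, -0.4], [1.0, 0.3], [-0.7, 0.1]]
output_weights = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.0], [-1.0, 0.5, 1.0], [0.3, 0.3, 0.3]]
probs = activity_probabilities(x, hidden_weights, output_weights)
predicted = probs.index(max(probs))         # index of the most probable activity
```

The predicted activity for a window is simply the activity with the largest probability.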

Figure 4.1. ANN for predicting activity type.

Figure 4.1 legend

* The number of output variables shown matches the number used in the study, but the

number of input features varied from 8-38, depending on the feature set tested. Additionally,

three hidden units are shown above for simplicity, but we used 15 hidden units for

constructing our ANNs.

Accelerometer signal features (one of each per axis, three total of each per accelerometer)

1. Mean = mean
2. Var = variance
3. Cov = covariance
4. Min = minimum
5. Max = maximum
6. MeanOR = mean accelerometer orientation
7. VarOR = variance of accelerometer orientation
8. 10th %ile = 10th percentile
9. 25th %ile = 25th percentile
10. 50th %ile = 50th percentile
11. 75th %ile = 75th percentile
12. 90th %ile = 90th percentile

Participant characteristics features

13. Ht = participant height 14. Wt = participant weight

Non-feature abbreviations

S = summations of the input layer in the hidden units

U = activation function for the hidden layer

W1 = the weight vectors for each of the inputs

W2 = the weight vectors for each of the summations

The ANNs were created and tested using a leave-one-out approach. In this approach, data

from all but one participant were used to estimate the weights for each input feature for predicting

activity type. Then, the ANN was tested on the data from the participant left out of the training

phase by supplying the input features and comparing the predicted activity type from the ANNs to

the recorded activity type from DO. The leave-one-out cross validation is an iterative approach

and was repeated with each participant’s data used as the testing data once, therefore obtaining an

ANN for activity type for each participant in the study. Weights determined from each iteration of

the leave-one-participant-out validation were averaged to obtain a final ANN for each

accelerometer placement site, resulting in four distinct ANNs.
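
The leave-one-participant-out splitting described above can be sketched as follows. This shows only how training and testing data are partitioned; model fitting and weight averaging are not shown, and the record layout and names are hypothetical.

```python
def leave_one_participant_out(records):
    """Yield (held_out_id, training_records, testing_records) once per participant.

    records: list of (participant_id, features, activity_label) tuples,
    one tuple per window of accelerometer data.
    """
    participant_ids = sorted({pid for pid, _, _ in records})
    for held_out in participant_ids:
        training = [r for r in records if r[0] != held_out]
        testing = [r for r in records if r[0] == held_out]
        yield held_out, training, testing


# Hypothetical windows from three participants.
records = [
    ("P01", [0.1, 0.9], "T1"), ("P01", [1.2, 0.3], "T7"),
    ("P02", [0.2, 0.8], "T1"), ("P02", [1.1, 0.4], "T7"),
    ("P03", [0.0, 1.0], "T1"),
]
folds = list(leave_one_participant_out(records))
```

Each participant appears in the testing data exactly once and never in the training data of the same iteration, which keeps the test independent of the data used to estimate the weights.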

There were two important considerations that were addressed in building our ANNs: 1)

window length and 2) relevant features to use as input variables.

Window length

In order to analyze accelerometer data, it must first be divided into segments, called

‘epochs’ or ‘windows,’ for analysis. By dividing the data into windows, activity type can be

assessed separately for each window to yield information on which activities were being performed

as well as when they were performed. Windows of 60 seconds are commonly used for predicting

energy expenditure while analyzing accelerometer data because summarizing a given energy

expenditure or activity performed each minute is intuitively appealing and works well for steady-

state activities (Staudenmayer, Pober et al. 2009; Freedson, Lyden et al. 2011). Additionally,

longer windows (i.e., 30-60 seconds) increase the amount of information available with which to

determine activity type, and they have been shown to improve activity classification accuracy

(Trost, Wong et al. 2012). However, a significant limitation of longer windows is that they are less

useful in free-living situations, where activities rarely start or end exactly on the minute and where

activities may last less than a minute in length (Lyden 2012). Thus, a 60-second window is likely

to encompass more than one activity, resulting in frequent activity misclassification due to

insufficient temporal resolution in the output.

On the other hand, very short windows (e.g., less than one second) may not allow enough

time to capture a movement (e.g., in one second a person may only take part of a step when

walking), therefore yielding insufficient information to classify the movement and resulting in

lower classification accuracy (Preece, Goulermas et al. 2009; Trost, Wong et al. 2012). Machine

learning techniques have been conducted with window lengths as short as 0.25 seconds, but many

studies use window lengths of 4-6.7 seconds for classifying activity type (Preece, Goulermas et al.

2009). Therefore, in accordance with previous research and for simplicity, we employed five-second

windows for activity type classification in our data processing and analyses.
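
As an illustration of the windowing step, the sketch below divides one axis of acceleration data into non-overlapping five-second windows. It assumes a 20 Hz signal (the rate at which the data were analyzed) and assumes that a trailing partial window is discarded; the function name is hypothetical.

```python
SAMPLE_RATE_HZ = 20   # samples per second
WINDOW_SECONDS = 5    # window length used for activity type classification


def split_into_windows(samples, rate_hz=SAMPLE_RATE_HZ, seconds=WINDOW_SECONDS):
    """Split a sequence of samples into consecutive, non-overlapping windows.

    A trailing partial window is dropped (an assumption of this sketch).
    """
    size = rate_hz * seconds
    n_full = len(samples) // size
    return [samples[i * size:(i + 1) * size] for i in range(n_full)]


# A 90-minute recording at 20 Hz contains 90 * 60 * 20 = 108,000 samples per axis.
signal = [0.0] * (90 * 60 * SAMPLE_RATE_HZ)
windows = split_into_windows(signal)
```

With these settings each window holds 100 samples, and a complete 90-minute protocol yields 1,080 windows per axis.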

Features

There are several different types of features that can be used as input variables: time-

domain features, frequency-domain features, and participant characteristics. Time-domain features

are most commonly used because they can be directly computed from the accelerometer signal

data, making them simple to extract and understand (Preece, Goulermas et al. 2009; Staudenmayer,

Pober et al. 2009). Examples of time-domain features are mean, variance, covariance, and

percentiles of the acceleration signal. The other main type of features, frequency-domain features,

can be used either in conjunction with or independently of time-domain features, yielding

accuracy for activity type classification similar to that of time-domain features in some studies

(Preece, Goulermas et al. 2009; Mannini, Intille et al. 2013). However, frequency-domain features

require mathematical transformations prior to computation and may require significant

computational power and specialized statistical software (Preece, Goulermas et al. 2009).

Additionally, several studies provide evidence that time-domain features can be used to achieve

high activity classification accuracy (71-99%) from a single accelerometer without use of

frequency-domain features (Herren, Sparti et al. 1999; Staudenmayer, Pober et al. 2009; Dong,

Montoye et al. 2013; Montoye, Dong et al. 2013). Other than time- and frequency-domain

features, simple descriptive features, such as accelerometer orientation or participant demographic

variables, can also be used to improve measurement accuracy.

Many accelerometer signal features have been used in previous research, and the models

created have varied considerably in complexity and measurement accuracy. Adding more features

may improve accuracy of the ANN; however, similarly to linear regression, addition of too many

input variables may lead to overfitting ANNs to the data used for training, resulting in poor

generalizability of the model when applied to a new population. Therefore, there must be a

balance of number of features used and accuracy achieved. Another consideration of adding too

many features is that it increases complexity of the models created and requires more

computational power to create. This added complexity can quickly render machine learning

models difficult to create or use for anyone lacking experience with computer programming and/or

access to expensive computing software (Pober, Staudenmayer et al. 2006; Rothney, Neumann et

al. 2007; Staudenmayer, Pober et al. 2009). Thus, we experimented with different sets of features

to determine a set that had high accuracy of measurement without being overly complex.

Before computing features, the 40 Hz data from the ActiGraph accelerometers were

reintegrated to 20 Hz for comparison with the data from the GENEA. Table 4.2 provides a list of

the 38 features tested and used in the current analyses. The 36 accelerometer features (12 features

for each of three axes) are all time-domain features that have been effectively utilized in previous

studies, and height and weight were included to account for different body sizes. Since five-

second windows were used for activity type classification, there were 100 accelerometer data

points within each five-second window with which to calculate the necessary features (20

samples/second * 5 seconds). Mean, variance, covariance, minimum, maximum, mean and

variance of monitor orientation, and the 10th, 25th, 50th, 75th, and 90th percentiles of the acceleration

signal were calculated separately for the x-, y-, and z-axes. After creating the ANNs using all 38

features, follow-up analyses were conducted to determine if a subset of features could reduce

complexity of the ANNs with minimal loss of accuracy. The subsets tested are shown in Table

4.3. Additionally, feature sets 1 and 2 were tested with and without height and weight included as

input features to determine if including demographic characteristics impacted classification

accuracy.
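
A minimal sketch of the per-window feature calculation follows. It covers only the features whose definitions are unambiguous here (mean, variance, minimum, maximum, and percentiles taken as the p-th smallest of 100 values); covariance and orientation are omitted. The use of the sample variance (n - 1 denominator) is an assumption, and all names are hypothetical.

```python
import statistics

PERCENTILES = (10, 25, 50, 75, 90)


def axis_features(window):
    """Time-domain features for one axis of a 100-sample (five-second, 20 Hz) window."""
    if len(window) != 100:
        raise ValueError("expected 100 samples per window")
    ordered = sorted(window)
    features = {
        "mean": statistics.mean(window),
        "var": statistics.variance(window),   # sample variance; an assumption
        "min": ordered[0],
        "max": ordered[-1],
    }
    for p in PERCENTILES:
        # Arrange from smallest to largest and pick the p-th value.
        features[f"p{p}"] = ordered[p - 1]
    return features


def window_features(x, y, z):
    """Combine the per-axis features of one window into a single labeled dictionary."""
    combined = {}
    for axis_name, window in (("x", x), ("y", y), ("z", z)):
        for name, value in axis_features(window).items():
            combined[f"{axis_name}_{name}"] = value
    return combined


# Hypothetical window: the values 1..100 on every axis.
ramp = [float(v) for v in range(1, 101)]
example = window_features(ramp, ramp, ramp)
```

This sketch produces nine features per axis (27 per window); participant height and weight would be appended as constant inputs for every window from the same participant.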

Table 4.2. Features used for EE and activity type prediction.

Feature number | Feature used | Formula for calculating feature
1-3* | Mean acceleration signal | mean(A_x)
4-6* | Variance of acceleration signal | [summation formula garbled in source]
7-9* | Covariance of acceleration signal | [summation formula garbled in source]
10-12* | Minimum of acceleration signal | min(A_x)
13-15* | Maximum of acceleration signal | max(A_x)
16-18* | 10th percentile of acceleration signal | For every 100 accelerations, arrange in order from smallest to largest and pick the 10th value
19-21* | 25th percentile of acceleration signal | For every 100 accelerations, arrange in order from smallest to largest and pick the 25th value
22-24* | 50th percentile of acceleration signal | For every 100 accelerations, arrange in order from smallest to largest and pick the 50th value
25-27* | 75th percentile of acceleration signal | For every 100 accelerations, arrange in order from smallest to largest and pick the 75th value
28-30* | 90th percentile of acceleration signal | For every 100 accelerations, arrange in order from smallest to largest and pick the 90th value
N/A | Accelerometer orientation (needed for calculating features 31-36) | [formula garbled in source]
31-33* | Mean accelerometer orientation | mean of the orientation values
34-36* | Variance of accelerometer orientation | [summation formula garbled in source]
37 | Participant height | N/A
38 | Participant weight | N/A

* Signifies that one feature is included for each of the three accelerometer axes. The formulas shown are for the x-axis, but the formulas for the y- and z-axes are similar. A_x is the acceleration in the direction of the x-axis.

Table 4.3. Feature sets used for creation and testing of ANNs.

Feature set number | Features used | Total number of features used
1 | Mean, variance, covariance, minimum, maximum, mean orientation, variance of orientation, and 10th, 25th, 50th, 75th, and 90th percentiles of acceleration signal, weight, and height | 38 (12 accelerometer features/axis * 3 axes + weight + height)
2 | Mean and variance of acceleration signal, weight, and height | 8 (2 accelerometer features/axis * 3 axes + weight + height)
3 | Mean, variance, minimum, and maximum of acceleration signal, weight, and height | 14 (4 accelerometer features/axis * 3 axes + weight + height)
4 | Mean, variance, covariance, minimum, and maximum of acceleration signal, weight, and height | 17 (5 accelerometer features/axis * 3 axes + weight + height)
5 | 10th, 25th, 50th, 75th, and 90th percentiles of acceleration signal, weight, and height | 17 (5 accelerometer features/axis * 3 axes + weight + height)
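
The feature totals in Table 4.3 follow from simple arithmetic, which the short sketch below reproduces. The per-axis counts are taken from the table; the variable and function names are hypothetical.

```python
AXES = 3                   # x, y, and z
PARTICIPANT_FEATURES = 2   # height and weight

# Number of accelerometer features per axis in each feature set (from Table 4.3).
FEATURES_PER_AXIS = {1: 12, 2: 2, 3: 4, 4: 5, 5: 5}


def total_features(per_axis, include_height_weight=True):
    """Total number of ANN inputs for a feature set."""
    extra = PARTICIPANT_FEATURES if include_height_weight else 0
    return per_axis * AXES + extra


totals = {number: total_features(n) for number, n in FEATURES_PER_AXIS.items()}
```

Setting `include_height_weight=False` gives the input counts for the variants of feature sets 1 and 2 that were tested without height and weight (36 and 6 inputs, respectively).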

Activity type classification

Although 14 activities were performed in the protocol, some activities could be combined

into common groupings. Differentiating among the sitting activities (computer use and reading)

and lying may be difficult using a single accelerometer on the thigh or hip since thigh movement

was minimal and thigh and hip orientation were similar for all three activities. However, these

sedentary activities elicit similar physiologic responses (Bey and Hamilton 2003), so

differentiation among them was not of central importance in this study. Therefore, these three

activities were grouped into a ‘sedentary’ category. Conversely, standing, which is considered SB

in most studies since its energy cost is less than 1.5 METs (Ainsworth, Haskell et al. 2011),

requires significant postural muscle contraction and elicits different physiologic responses from

sitting and lying down (Hamilton, Hamilton et al. 2004; Hamilton, Hamilton et al. 2007). Thus,

standing had its own category, separate from the sedentary category. Squats and biceps curls are

both exercise activities, so they were grouped into an ‘exercise’ category. Finally, laundry and

sweeping are both lifestyle activities that are light intensity and involve intermittent movement of

both the upper and lower body. Thus, these were combined into the ‘lifestyle’ category. The rest

of the activities had their own categories. In summary, we evaluated activity classification

accuracy for all 14 activities and then for the 10 categories. These 10 categories are displayed in

the leftmost column of Table 4.1. It is important to note that these categories were not meant to

imply that certain types of activities could not be in a different category (e.g., walking and running

are often used for exercise rather than ambulation). Rather, these categories were developed to

group similar activities to offer a better idea of the utility of the ANNs for activity classification

accuracy. Additionally, a third grouping was performed by grouping activities into intensity

categories (sedentary, light, moderate, vigorous) in order to determine how well the ANNs can

predict the relative intensity of an activity.
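
The grouping of the 14 DO codes into the 10 activity categories can be expressed as a simple lookup, sketched below from the pairings in Table 4.1. The dictionary and function names are hypothetical, and the intensity grouping is not shown.

```python
# DO code -> activity category abbreviation, following Table 4.1.
ACTIVITY_TO_CATEGORY = {
    "T1": "SE", "T2": "SE", "T3": "SE",   # lying, reading, computer -> sedentary
    "T4": "ST",                           # standing
    "T5": "LI", "T6": "LI",               # laundry, sweeping -> lifestyle
    "T7": "LW",                           # leisure walk
    "T8": "BW",                           # brisk walk
    "T9": "JO",                           # jogging
    "T10": "CY",                          # cycling
    "T11": "SU",                          # stair use
    "T12": "EX", "T13": "EX",             # biceps curls, squats -> exercise
    "T14": "NW",                          # non-wear
}


def to_categories(codes):
    """Map a sequence of DO codes to their activity categories."""
    return [ACTIVITY_TO_CATEGORY[code] for code in codes]


example = to_categories(["T2", "T3", "T6", "T13", "T14"])
```

Applying the same mapping to both the observed and the predicted codes allows accuracy to be evaluated at the category level without retraining the ANNs.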

Identifying non-wear

Non-wear was classified as a separate activity type from the 13 other activities performed

by the participants in the 90-minute free-living simulation. By creating a distinct category and

training the ANNs to recognize non-wear, we hoped to eliminate the need to establish coding rules

for how many minutes of consecutive zero counts determine non-wear when accelerometers are

worn in free-living settings (Masse, Fuemmeler et al. 2005; Evenson and Terry 2009).

Direct observation

With the BEST software, DO data were recorded instantaneously and were not reintegrated

into predefined windows for analysis. Therefore, within each five-second window of

accelerometer data, there were one or more activities performed. When participants transitioned

between activities, usually the activity transition occurred in the middle of a five-second window

(as opposed to perfectly at the end of one window/start of another), meaning that it was not

possible to accurately predict the activity performed in the transition. Therefore, in each five-

second window in which a transition between activities occurred, the window was removed from

the data set before training and testing the activity type ANNs. The removal of transition windows

was necessary for validation purposes; when implemented in a free-living setting, transitions

between activities and multiple activities performed in a single window cannot be classified

correctly since predictive models can only predict one activity for a window. To minimize this

issue, we used five-second windows instead of the longer windows used in many previous studies

(Rothney, Neumann et al. 2007; Preece, Goulermas et al. 2009; Staudenmayer, Pober et al. 2009).
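
The removal of transition windows can be sketched as follows. The sketch assumes that the DO record is a time-ordered list of (start time in seconds, activity code) pairs and that each activity lasts until the next entry begins; this data layout and the function name are assumptions, not a description of the BEST software output.

```python
from bisect import bisect_left, bisect_right


def label_windows(events, total_seconds, window_seconds=5):
    """Return (window_index, code) for every window containing a single activity.

    events: time-ordered list of (start_second, activity_code) pairs.
    Windows in which a new activity begins partway through are dropped.
    """
    starts = [start for start, _ in events]
    labelled = []
    for index in range(int(total_seconds // window_seconds)):
        w_start = index * window_seconds
        w_end = w_start + window_seconds
        first = bisect_right(starts, w_start) - 1   # activity under way at window start
        last = bisect_left(starts, w_end) - 1       # last activity begun before window end
        if first < 0:
            continue                                # no observation recorded yet
        if first == last:
            labelled.append((index, events[first][1]))
    return labelled


# Hypothetical observation record: lying, then standing at 12 s, then walking at 20 s.
events = [(0.0, "T1"), (12.0, "T4"), (20.0, "T7")]
kept = label_windows(events, total_seconds=30)
```

In this example the window covering 10-15 seconds contains the change from lying to standing and is dropped, whereas the change to walking at exactly 20 seconds falls on a window boundary and no window is lost.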

Statistical analyses

Classification accuracies were determined by calculating the sensitivity, specificity, and

area under the curve (AUC) for each ANN. Operational definitions of these three variables follow.

Sensitivity: Sensitivity refers to the ability of each ANN to correctly classify an activity when it

occurs (Parikh, Mathai et al. 2008). It represents the proportion of times an activity was predicted

when it actually occurred. Percent agreement, which is equivalent to sensitivity, is most often

reported in the literature for defining classification accuracy and was our primary measure of

classification accuracy in this study.

Specificity: Specificity refers to the ability of each ANN to correctly classify an activity as not

occurring when it does not occur. It represents the proportion of times an activity was not

predicted when the activity, in fact, did not occur (Parikh, Mathai et al. 2008).

Area under the curve: AUC is the area under the receiver operating characteristic curve created by

graphing sensitivity of a variable on the y axis and 1-specificity on the x axis. A value of 1.00

represents perfect classification accuracy and a value of 0.50 represents accuracy which is no better

than what would be attained from chance alone. According to Metz, AUC values of ≥ 0.90 are

considered excellent, 0.80-0.89 are good, 0.70-0.79 are fair, and <0.70 are considered poor

classification accuracy (Metz 1978).
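
The definitions above translate directly into code. The sketch below computes sensitivity and specificity from window counts and applies the AUC categories attributed to Metz; the example counts are those reported for lying with the hip accelerometer in the Results, and the function names are hypothetical.

```python
def sensitivity(correctly_predicted, windows_with_activity):
    """Proportion of windows in which the activity was predicted when it occurred."""
    return correctly_predicted / windows_with_activity


def specificity(falsely_predicted, windows_without_activity):
    """Proportion of windows in which the activity was not predicted when it did not occur."""
    return (windows_without_activity - falsely_predicted) / windows_without_activity


def metz_rating(auc):
    """Qualitative AUC category using the cut points given in the text."""
    if auc >= 0.90:
        return "excellent"
    if auc >= 0.80:
        return "good"
    if auc >= 0.70:
        return "fair"
    return "poor"


# Lying, hip accelerometer: 2,677 of 2,948 lying windows correctly classified;
# lying predicted in 179 of the 39,277 windows in which other activities occurred.
lying_sensitivity = sensitivity(2677, 2948)
lying_specificity = specificity(179, 39277)
```

Because most windows belong to other activities, specificity for any single activity tends to be high even when its sensitivity is modest, which is one reason to report both measures.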

Sensitivities, specificities, and AUC were calculated from each accelerometer and each

iteration of the leave-one-out validation. Differences among the hip, wrists, and thigh

accelerometers were evaluated using repeated measures analysis of variance (RMANOVA). If the

RMANOVA test statistic was significant, post hoc tests were conducted using dependent-samples

t-tests and a least significant difference (LSD) correction in order to account for multiple

comparisons and avoid inflation of type I error. Additionally, RMANOVA was used to compare

classification accuracies among different feature sets. The a priori alpha level was set at 0.05.

After running primary analyses for the left- and right-wrist accelerometer placements, the data set

was rearranged to compare dominant vs. non-dominant wrist placements, and dependent-samples

t-tests were run to compare overall sensitivities for the dominant vs. non-dominant wrist.

Confusion matrices were created for each of the four accelerometer placements, with the

actual activity performed as the rows of each matrix and the predicted activity as the columns of

each matrix.
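
A confusion matrix with this layout (actual activities as rows, predicted activities as columns) can be built as sketched below, along with the per-activity sensitivity obtained from each row. The labels and predictions shown are hypothetical.

```python
def confusion_matrix(actual, predicted, labels):
    """Count windows by actual activity (rows) and predicted activity (columns)."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def row_sensitivities(matrix):
    """Sensitivity per activity: diagonal count divided by the row total."""
    return [row[i] / sum(row) if sum(row) else float("nan")
            for i, row in enumerate(matrix)]


# Hypothetical results for six windows and three activities.
labels = ["T1", "T4", "T7"]
actual = ["T1", "T1", "T4", "T4", "T7", "T7"]
predicted = ["T1", "T4", "T4", "T4", "T7", "T1"]
matrix = confusion_matrix(actual, predicted, labels)
per_activity = row_sensitivities(matrix)
```

Off-diagonal cells show which activities are confused with one another, which an overall accuracy figure alone does not reveal.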

Power analysis

We desired 80% power to detect a difference of at least moderate effect size (ES=0.5)

among accuracies of accelerometers compared to the criterion measure. Therefore, with the α level

set at α = 0.05, we needed 34 participants to be sufficiently powered to detect a moderate effect

size difference among groups. We chose to oversample by 10 participants in order to have

adequate sample size despite an expected loss of a few participants due to the possibility of

equipment malfunction, especially when using multiple accelerometers, a handheld computer, and

a portable metabolic analyzer (used for a different aim of the study).
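
The source does not state which formula or software produced the required sample size. As a rough check only, the sketch below uses a common normal-approximation formula for a two-sided paired comparison with a small-sample correction; the choice of formula is an assumption, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def paired_sample_size(effect_size, alpha=0.05, power=0.80):
    """Approximate n for a two-sided paired comparison (normal approximation).

    Uses n = ((z_alpha/2 + z_power) / d)^2 + z_alpha/2^2 / 2, where the second
    term is a small-sample correction. This is an assumed method for illustration.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = ((z_alpha + z_power) / effect_size) ** 2 + z_alpha ** 2 / 2
    return ceil(n)


n_required = paired_sample_size(0.5)   # moderate effect size, 80% power, alpha = 0.05
n_recruited = n_required + 10          # oversampling by 10 participants
```

Under these assumptions the approximation gives 34 participants, and oversampling by 10 gives the 44 participants from whom data were collected.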

RESULTS

Data were collected from 44 participants for the current study. However, significant data

loss from the accelerometers occurred in two participants, and there was an Oxycon portable

metabolic analyzer malfunction in three other participants which resulted in premature

termination of data collection. The remaining 39 participants who completed the 90-minute

protocol and had usable data were included in analyses (shown in Table 4.4). Those excluded

from the analysis were not statistically different from those included in terms of demographic

characteristics.

Table 4.4. Demographic characteristics of participants enrolled in study.

Characteristic | All (n=39) | Males (n=19) | Females (n=20)
Age (years) | 22.1 (4.3) | 23.7 (5.0) | 20.5 (2.7)
Weight (kg) | 72.4 (16.2) | 84.5 (13.1) | 60.8 (8.9)
Height (cm) | 171.4 (10.1) | 179.1 (7.7) | 164.1 (5.7)
BMI (kg/m²) | 24.4 (3.6) | 26.3 (3.4) | 22.5 (2.6)

Data are displayed as mean (SD).

As most studies present classification accuracy in terms of sensitivity only, we present

the first part of our analysis in terms of sensitivity. Sensitivities for each accelerometer

placement are shown in Figure 4.2. The sensitivities were as high as 80.9% and 81.1% for the

left and right wrist accelerometer placements, respectively, with feature set 1. Both wrist

placements had significantly higher sensitivities than the thigh or hip placements, and this

difference existed for all five sets of features tested. Additionally, the thigh placement had

significantly higher sensitivity than the hip placement for all five feature sets. For all five sets of

features, the two wrist placements achieved similar overall sensitivities. Finally, feature sets 1

and 2 were modified to exclude height and weight as input features to determine if these

demographic characteristics affected classification accuracy. For both feature sets, classification

accuracies were unchanged by excluding height and weight as predictor variables.

Figure 4.2 also shows comparisons of classification accuracies achieved among the five

feature sets. For all four accelerometers, feature set 1 provided the highest sensitivity, and feature

set 2 provided the lowest sensitivity. Additionally, the ANNs created with feature sets 4 and 5

provided sensitivities similar to that achieved using feature set 1, but improvements from feature

set 2 were no longer statistically significant for the wrist accelerometers. The ANNs created

from feature set 3 yielded similar sensitivities to feature set 1 for the thigh and both wrist

accelerometers but significantly lower sensitivity with the hip accelerometer. Additionally,

inclusion and exclusion of height and weight as input variables had no effect on classification

accuracy. Due to the superiority of feature set 1 compared to the other feature sets, further

analyses were performed using feature set 1.

Figure 4.2. Sensitivity for the four accelerometer placements, compared among feature sets.

The * indicates significant differences from all other accelerometer placement sites.

The † indicates significant differences from feature set 1 (all 38 features).

The ^ indicates significant differences from feature set 2.

Table 4.5 provides a comparison of the sensitivity, specificity, and AUC of each

accelerometer placement using feature set 1. All three measures were significantly higher for

the two wrist-mounted accelerometers than the thigh or hip accelerometer placements, and all

three were also significantly higher for the thigh than the hip placement. The magnitude of

differences was much larger for sensitivity than specificity, which was consistently high across

all accelerometer placements. With AUC values of 0.90, both wrists achieved excellent

classification accuracy; in contrast, the hip and thigh placement sites achieved good

classification accuracy with AUC values of 0.82 and 0.84, respectively.

[Figure 4.2 chart: sensitivity (%) on the y-axis (0.0-100.0) by feature set (1-5) on the x-axis, with one series each for the ActiGraph Hip, ActiGraph Thigh, GENEA L. Wrist, and GENEA R. Wrist placements.]

Table 4.5. Overall sensitivity, specificity, and AUC for each of the four accelerometer

placements for feature set 1.

                 ActiGraph Hip       ActiGraph Thigh     GENEA Left Wrist    GENEA Right Wrist
Sensitivity (%)  66.4 (65.9-66.8)    71.7 (71.3-72.1)*   81.3 (81.0-81.7)^   81.4 (81.1-81.8)^
Specificity (%)  97.4 (97.2-97.5)    97.8 (97.7-97.9)*   98.5 (98.4-98.6)^   98.5 (98.4-98.7)^
AUC              0.82 (0.80-0.84)    0.84 (0.83-0.86)*   0.90 (0.89-0.91)^   0.90 (0.88-0.91)^

Values are reported as mean (95% CI).

The * indicates significant differences from the hip accelerometer placement.

The ^ indicates significant differences from the hip and thigh accelerometer placements.

Confusion matrices

The confusion matrices for each of the four accelerometer placement sites (using the

ANN created using feature set 1) can be found in Tables 4.6-4.9. The rows of each confusion

matrix are the actual activities performed, and the columns in each matrix represent the activities

predicted by the ANN. In Tables 4.6-4.9, the “Total” column represents the total number of five-

second windows of data recorded for each activity, combined for all 39 participants. The cells

highlighted in gray represent the number of windows correctly classified for each activity. Table

4.10 shows the overall sensitivity, specificity, and AUC values (calculated from the data in the

confusion matrices) across all 14 activities for each of the four accelerometer placements.

Overall AUC was 0.90 for each of the wrist accelerometer placements, indicating excellent

classification accuracy according to parameters suggested by Metz (Metz 1978). The hip and

thigh placements achieved AUC values of 0.82 and 0.85, respectively, indicating good

classification accuracy. To calculate sensitivity for a specific activity, we divided the number of

correctly classified windows by the total number of windows in which that activity was


performed. For example, in Table 4.6, lying was correctly classified by the hip accelerometer

in 2,677 of the 2,948 windows in which lying took place, resulting in a sensitivity of 90.8%. For

an example of specificity, activities other than lying were performed for a total of 39,277

windows for the hip placement (Table 4.6). Of these, the hip accelerometer ANN predicted lying

as the activity performed only 179 times, resulting in a specificity of (39,277-179)/39,277 =

0.995. The AUC values were calculated, based on the sensitivity and specificity values, using

Microsoft Excel.
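The worked example above can be reproduced with a short script. The following is a minimal sketch in Python (the analysis itself was carried out in Microsoft Excel) that recomputes sensitivity and specificity for lying from the hip confusion matrix in Table 4.6. The function names are hypothetical. The text does not state how AUC was derived from sensitivity and specificity, so the single-point average used in `single_point_auc` is an assumption, although it appears to reproduce the tabulated value of 0.95 for lying at the hip.

```python
# Confusion matrix for the hip-mounted ActiGraph, transcribed from Table 4.6.
# Rows are actual activities; columns are activities predicted by the ANN.
LABELS = ["LY", "RE", "CO", "ST", "LA", "SW", "WS", "WF", "JO", "CY", "BC", "SQ", "SR", "NW"]
HIP = [
    [2677, 17, 1, 11, 70, 3, 2, 1, 0, 6, 53, 0, 1, 106],
    [21, 1215, 1055, 441, 108, 20, 4, 0, 0, 45, 149, 3, 1, 270],
    [2, 729, 1354, 795, 102, 11, 5, 0, 0, 14, 179, 0, 1, 265],
    [0, 273, 429, 1447, 44, 12, 2, 0, 0, 21, 238, 0, 0, 75],
    [111, 27, 38, 21, 1726, 425, 30, 1, 2, 310, 612, 100, 2, 6],
    [2, 11, 4, 9, 395, 1473, 34, 0, 0, 404, 32, 79, 12, 0],
    [0, 3, 0, 0, 1, 49, 2057, 298, 20, 284, 0, 13, 349, 0],
    [0, 3, 0, 0, 3, 7, 249, 2228, 13, 4, 4, 19, 360, 0],
    [0, 0, 0, 0, 25, 3, 6, 26, 2223, 0, 0, 13, 105, 2],
    [1, 8, 3, 0, 239, 435, 125, 6, 0, 1970, 170, 2, 27, 1],
    [3, 220, 133, 269, 585, 24, 7, 0, 0, 153, 1139, 9, 2, 0],
    [0, 1, 11, 1, 67, 97, 53, 0, 0, 126, 7, 1664, 81, 1],
    [0, 0, 0, 0, 1, 9, 241, 114, 68, 8, 0, 14, 3116, 0],
    [39, 527, 143, 85, 10, 4, 0, 0, 2, 4, 6, 0, 0, 3683],
]


def sensitivity(matrix, k):
    """Correctly classified windows of activity k divided by all windows of activity k."""
    return matrix[k][k] / sum(matrix[k])


def specificity(matrix, k):
    """Share of windows of all other activities that were not predicted as activity k."""
    negatives = sum(sum(row) for i, row in enumerate(matrix) if i != k)
    false_positives = sum(row[k] for i, row in enumerate(matrix) if i != k)
    return (negatives - false_positives) / negatives


def single_point_auc(sens, spec):
    """Assumed AUC estimate: area under a ROC curve with a single operating point."""
    return (sens + spec) / 2


lying = LABELS.index("LY")
sens = sensitivity(HIP, lying)
spec = specificity(HIP, lying)
print(f"Lying (hip): sensitivity {sens:.1%}, specificity {spec:.3f}, "
      f"AUC {single_point_auc(sens, spec):.2f}")
```

The same three functions can be applied to any row of Tables 4.6-4.9 by changing the matrix and the label passed to `LABELS.index`.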

A significant advantage of displaying data with a confusion matrix is that the matrix

allows one to assess misclassification to determine potential weaknesses of activity classification

from each accelerometer placement site and identify the types of activities for which each site

has the highest classification accuracy. From the confusion matrices, it is apparent that the thigh

placement performed best for the slow walk and stairs (although the hip was within 2% for stairs). At

77.1% sensitivity, the hip accelerometer placement site was best for the fast walk; conversely,

the wrist sites were best for jogging, although all four placement sites correctly recognized

jogging greater than 90% of the time.

For the exercise activities, the wrist accelerometer placements achieved sensitivities close

to 90%. The thigh had similar sensitivity for squats but much lower sensitivity for classifying

biceps curls (53.4%). Similarly, for the lifestyle activities, the wrist placements outperformed the

hip and thigh placements, achieving sensitivities close to 80% for each activity (with the hip at

50-60% and the thigh at 59-73%). Moreover, two of the three sedentary activities (reading and

computer use) were least likely to be detected by the hip placement (Table 4.6) and most likely

to be detected by the two wrist placements (Tables 4.8 and 4.9), although sensitivity for

recognizing lying down was highest with the hip (90.8%).


With the hip and thigh accelerometer placements, reading and computer use had low

sensitivities (36-55%) due to frequent misclassification of one activity as the other.

Additionally, the hip accelerometer ANN often misclassified these activities as standing (13-23%

of the time), while the thigh accelerometer ANN incorrectly predicted these activities as lying

down 12.7-19.3% of the time. For sweeping and laundry (the lifestyle activities), the hip and

thigh ANNs often misclassified one as the other (7-16% of the time) or as biceps curls (15-22%

of the time). Additionally, activities that took place while standing with minimal movement

(standing, biceps curls) were often classified incorrectly by the hip and thigh accelerometer

placements, with one often mistaken as the other. Cycling was not well-recognized by the hip as

it was often misclassified as a lifestyle activity (8.0% for laundry and 14.6% for sweeping);

conversely, cycling was detected with >84% sensitivity with the other three accelerometer

placements. Finally, all four accelerometers had trouble distinguishing between the two walking

speeds, frequently misclassifying one as the other or as stairs (9-14% of the time).

Activity categories

Upon further examination of the four confusion matrices (Tables 4.6-4.9), it was

apparent that classification accuracies were lowest among the sedentary activities for the hip and

thigh accelerometer ANNs, with frequent misclassification of one sedentary activity as another

sedentary activity (i.e., reading as computer use or vice versa). However, for our purposes, it

was less critical to be able to differentiate among sedentary activities than it was to be able to

correctly identify when a sedentary activity occurred (vs. a non-sedentary activity); therefore, we

performed follow-up analyses combining lying, reading, and computer use into a ‘sedentary’

category. Similarly, we combined laundry and sweeping into a ‘lifestyle’ category and squats

and biceps curls into an ‘exercise’ category, leaving 10 activity categories. We chose


not to group the two walking speeds since they represent different intensities of movement (i.e.,

light vs. moderate) and may have different health implications.
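The regrouping described above amounts to summing rows and columns of a confusion matrix. The sketch below is one possible way to do this in Python, using the thigh matrix from Table 4.7; the `collapse` and `overall_sensitivity` helpers and the `GROUPS` mapping are hypothetical names introduced for illustration. Overall sensitivity is assumed here to be the pooled fraction of correctly classified windows, which appears consistent with the thigh totals reported in the text (71.4% before and 84.0% after combining).

```python
LABELS = ["LY", "RE", "CO", "ST", "LA", "SW", "WS", "WF", "JO", "CY", "BC", "SQ", "SR", "NW"]
# Thigh-mounted ActiGraph confusion matrix, transcribed from Table 4.7
# (rows = actual activity, columns = predicted activity).
THIGH = [
    [1590, 711, 310, 0, 3, 2, 1, 0, 0, 8, 63, 3, 5, 252],
    [643, 1281, 1224, 0, 40, 5, 2, 0, 0, 6, 0, 3, 2, 126],
    [440, 884, 1889, 0, 32, 0, 4, 0, 0, 4, 0, 1, 2, 201],
    [3, 26, 28, 1768, 149, 13, 4, 0, 0, 1, 548, 0, 1, 0],
    [18, 71, 102, 54, 2024, 528, 33, 1, 0, 46, 518, 15, 1, 0],
    [1, 4, 5, 9, 534, 1786, 42, 0, 0, 53, 8, 2, 11, 0],
    [1, 0, 0, 0, 16, 64, 2527, 322, 10, 80, 2, 0, 52, 0],
    [0, 0, 0, 0, 9, 2, 366, 2078, 43, 80, 7, 7, 298, 0],
    [0, 0, 0, 0, 1, 3, 44, 27, 2238, 20, 0, 0, 70, 0],
    [50, 29, 2, 0, 116, 9, 12, 1, 9, 2694, 4, 9, 50, 2],
    [0, 16, 13, 409, 680, 20, 7, 0, 0, 18, 1359, 22, 0, 0],
    [0, 2, 0, 2, 44, 30, 75, 5, 0, 36, 15, 1875, 21, 4],
    [0, 0, 0, 0, 4, 3, 137, 104, 57, 83, 0, 1, 3182, 0],
    [566, 7, 35, 3, 8, 0, 0, 0, 1, 2, 4, 3, 1, 3873],
]
# Activities merged into combined categories; all others keep their own label.
GROUPS = {"LY": "SE", "RE": "SE", "CO": "SE",   # sedentary
          "LA": "LI", "SW": "LI",               # lifestyle
          "BC": "EX", "SQ": "EX"}               # exercise


def collapse(matrix, labels, groups):
    """Sum rows and columns of a confusion matrix according to a label grouping."""
    new_labels = []
    for label in labels:
        group = groups.get(label, label)
        if group not in new_labels:
            new_labels.append(group)
    index = {group: i for i, group in enumerate(new_labels)}
    merged = [[0] * len(new_labels) for _ in new_labels]
    for i, actual in enumerate(labels):
        for j, predicted in enumerate(labels):
            row = index[groups.get(actual, actual)]
            col = index[groups.get(predicted, predicted)]
            merged[row][col] += matrix[i][j]
    return new_labels, merged


def overall_sensitivity(matrix):
    """Pooled fraction of windows on the diagonal (correctly classified)."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


categories, merged = collapse(THIGH, LABELS, GROUPS)
sedentary = categories.index("SE")
print(categories)
print(f"Overall: {overall_sensitivity(THIGH):.1%} -> {overall_sensitivity(merged):.1%}")
print(f"Sedentary: {merged[sedentary][sedentary] / sum(merged[sedentary]):.1%}")
```

Because misclassifications within a combined category move onto the diagonal, sensitivity can only stay the same or increase after collapsing.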

Instead of displaying four additional confusion matrices, we summarized overall

classification accuracies for each accelerometer placement in Table 4.11. As can be seen, overall

AUC values improved for all four accelerometer sites when combining similar activities into

categories, with the largest improvement seen for the thigh accelerometer placement (AUC of

0.91). Additionally, when looking at overall sensitivity for classification, all four accelerometer

placement sites saw increased sensitivity, with the largest improvement, 12.6% (71.4% to

84.0%), seen in the thigh placement. The hip ANN accuracy improved 6.1%, and wrist

placements’ accuracies improved by 5.6-6.3%.

After combining into 10 categories, sedentary activities were still classified with lowest

sensitivity with the hip accelerometer placement (72.6%), although the sensitivity achieved with

the thigh placement (92.1%) approached that achieved by the wrist-mounted accelerometers

(92.7-93.5%). Standing was classified with much lower sensitivity by the hip and thigh

placements (56.9-69.6%) than the wrist placements (90.0-90.2%) due to frequent

misclassification with biceps curls (exercise category). Also, both the lifestyle and exercise

activities were best classified by the wrist placement sites (89.8-90.5% and 88.9-90.1%)

compared to the hip (68.5% and 60.6%) and thigh (83.1% and 70.3%) accelerometer placements.

Furthermore, jogging was classified with over 90% sensitivity for all placement sites but was

slightly better with the wrists (95.3-96.1%) than the hip (92.5%) or thigh (93.1%). Finally, non-

wear was detected with over 80% sensitivity for all placements but was highest among the wrist

sites (88.8-91.3%). Therefore, the two wrist accelerometer placements appeared to provide


superior classification accuracies overall as well as for many of the specific activities and activity

categories.

Activity intensity categories

Lastly, we combined the 14 activities according to their intensity in order to determine

how well each accelerometer placement site could correctly classify activity intensity. The MET

values elicited by each activity are shown in Table 4.12, both as estimated from the

Compendium of Physical Activities and from the average METs measured by the portable

metabolic analyzer during the visit. Activities that were seated or lying and required less than

1.5 METs were classified as sedentary (SBRN 2012). Activities with an intensity between 1.5

and 2.9 METs were classified as light, those with an intensity of 3.0-5.9 METs were considered

moderate, and those with an intensity at or above 6.0 METs were considered vigorous (PAGAC 2008).

Both methods for intensity classification resulted in the same intensity categorization for each

activity, yielding three sedentary activities, five light-intensity activities, three moderate-intensity

activities, and two vigorous-intensity activities (non-wear was not included in an intensity

category). Sensitivity, specificity, and AUC for correct classification of activity intensity can be

seen in Table 4.13. Overall sensitivities increased for all four accelerometer placements

compared to sensitivities achieved when classifying individual activities or activity categories.

Additionally, sensitivity increased the most in the thigh placement, surpassing the sensitivities

achieved by the wrist placement sites. Specificity dropped slightly for all placements, resulting

in no change in AUC for the hip or wrist placements compared to that achieved for classification

of 10 activity categories. However, the AUC for the thigh placement increased to 0.94 and was

significantly higher than that achieved by the wrist placements.
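The intensity cut-points described above can be expressed as a small function. The following Python sketch applies them to the Compendium and measured MET values from Table 4.12. The `classify_intensity` name and the seated-or-lying flags are assumptions added for illustration (the flags are inferred from the activity descriptions), as is the treatment of standing, which falls below 1.5 measured METs but is labeled light in Table 4.12 because it is neither seated nor lying.

```python
from collections import Counter

# (activity, Compendium METs, measured mean METs, seated or lying?) from Table 4.12.
# The posture flags are assumptions inferred from the activity descriptions.
ACTIVITIES = [
    ("Lying", 1.3, 1.4, True),
    ("Reading", 1.3, 1.4, True),
    ("Computer", 1.3, 1.4, True),
    ("Standing", 1.8, 1.4, False),
    ("Laundry", 2.0, 2.1, False),
    ("Sweeping", 2.3, 2.5, False),
    ("Biceps curls", 2.5, 2.0, False),
    ("Walk slow", 2.8, 2.9, False),
    ("Walk fast", 4.3, 4.2, False),
    ("Cycling", 4.8, 4.4, True),
    ("Squats", 5.0, 4.5, False),
    ("Stairs", 8.0, 6.8, False),
    ("Jogging", 8.3, 8.0, False),
]


def classify_intensity(mets, seated_or_lying):
    """Map a MET value and posture to an intensity category."""
    if seated_or_lying and mets < 1.5:
        return "sedentary"
    if mets < 3.0:   # assumed: non-seated activities below 1.5 METs count as light
        return "light"
    if mets < 6.0:
        return "moderate"
    return "vigorous"


by_compendium = {name: classify_intensity(c, posture) for name, c, _, posture in ACTIVITIES}
by_measured = {name: classify_intensity(m, posture) for name, _, m, posture in ACTIVITIES}
counts = Counter(by_compendium.values())

print(by_compendium == by_measured)
print(dict(counts))
```

Under these assumptions both sources of MET values give the same category for every activity, matching the three sedentary, five light, three moderate, and two vigorous activities reported in the text.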


Examining differences between the left and right wrists for classification accuracy

yielded only small differences. The right wrist classified lying and computer use better than the

left wrist, whereas the left wrist had a higher classification accuracy for reading than the right

wrist; however, when grouped into a sedentary category, classification accuracies were less than

1% different between the wrist monitors (93.5% for left wrist and 92.7% for right wrist). In fact,

the only activities with more than a 1% difference in classification accuracies between wrists

were cycling (3.1% higher for left wrist), exercise activities (1.2% higher for right wrist), stairs

(1.1% better for right wrist), and non-wear (2.5% better for right wrist).
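As a quick consistency check, the left-right comparison can be recomputed from the category-level sensitivities in Table 4.11. The Python sketch below is illustrative only; the `wrist_gaps` helper and the 1.0 percentage-point threshold argument are hypothetical names chosen to mirror the wording of the paragraph above.

```python
# Category-level sensitivities (%) for the two wrist placements, from Table 4.11.
LEFT = {"SE": 93.5, "ST": 90.2, "LI": 90.5, "WS": 71.5, "WF": 69.9,
        "JO": 95.3, "CY": 87.2, "EX": 88.9, "SR": 73.3, "NW": 88.8}
RIGHT = {"SE": 92.7, "ST": 90.0, "LI": 89.8, "WS": 72.2, "WF": 69.4,
         "JO": 96.1, "CY": 84.1, "EX": 90.1, "SR": 74.4, "NW": 91.3}


def wrist_gaps(left, right, threshold=1.0):
    """Return left-minus-right differences whose magnitude exceeds the threshold."""
    gaps = {}
    for category in left:
        diff = round(left[category] - right[category], 1)
        if abs(diff) > threshold:
            gaps[category] = diff
    return gaps


# Positive values favor the left wrist; negative values favor the right wrist.
print(wrist_gaps(LEFT, RIGHT))
```

Only cycling, exercise, stairs, and non-wear exceed the threshold, in agreement with the activities listed in the text.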

Since four of the 39 participants included in the analyses reported being left-hand

dominant, we also analyzed overall classification accuracy between the dominant and non-

dominant wrist accelerometer placements (as opposed to strictly comparing left and right wrists).

As can be seen in Figure 4.3, no significant differences existed in overall classification accuracy

between the dominant and non-dominant wrist placements for any of the five feature sets.


Table 4.6. Confusion matrix for activity type classification from a hip-mounted ActiGraph accelerometer.

AG Hip   LY RE CO ST LA SW WS WF JO CY BC SQ SR NW Total

LY 2677 17 1 11 70 3 2 1 0 6 53 0 1 106 2948

RE 21 1215 1055 441 108 20 4 0 0 45 149 3 1 270 3332

CO 2 729 1354 795 102 11 5 0 0 14 179 0 1 265 3457

ST 0 273 429 1447 44 12 2 0 0 21 238 0 0 75 2541

LA 111 27 38 21 1726 425 30 1 2 310 612 100 2 6 3411

SW 2 11 4 9 395 1473 34 0 0 404 32 79 12 0 2455

WS 0 3 0 0 1 49 2057 298 20 284 0 13 349 0 3074

WF 0 3 0 0 3 7 249 2228 13 4 4 19 360 0 2890

JO 0 0 0 0 25 3 6 26 2223 0 0 13 105 2 2403

CY 1 8 3 0 239 435 125 6 0 1970 170 2 27 1 2987

BC 3 220 133 269 585 24 7 0 0 153 1139 9 2 0 2544

SQ 0 1 11 1 67 97 53 0 0 126 7 1664 81 1 2109

SR 0 0 0 0 1 9 241 114 68 8 0 14 3116 0 3571

NW 39 527 143 85 10 4 0 0 2 4 6 0 0 3683 4503

The “Total” column is the total number of five-second intervals in which each activity was performed (data from all 39 participants).

Rows are actual activities performed, and columns are predicted activities.

LY = Lying, RE = Reading, CO = Computer, ST = Standing, LA = Laundry, SW = Sweeping, WS = Walk slow, WF = Walk fast, JO

= Jogging, CY = Cycling, BC = Biceps curls, SQ = Squats, SR = Stairs, NW = Non-wear.


Table 4.7. Confusion matrix for activity type classification from a thigh-mounted ActiGraph accelerometer.

AG Thigh   LY RE CO ST LA SW WS WF JO CY BC SQ SR NW Total

LY 1590 711 310 0 3 2 1 0 0 8 63 3 5 252 2948

RE 643 1281 1224 0 40 5 2 0 0 6 0 3 2 126 3332

CO 440 884 1889 0 32 0 4 0 0 4 0 1 2 201 3457

ST 3 26 28 1768 149 13 4 0 0 1 548 0 1 0 2541

LA 18 71 102 54 2024 528 33 1 0 46 518 15 1 0 3411

SW 1 4 5 9 534 1786 42 0 0 53 8 2 11 0 2455

WS 1 0 0 0 16 64 2527 322 10 80 2 0 52 0 3074

WF 0 0 0 0 9 2 366 2078 43 80 7 7 298 0 2890

JO 0 0 0 0 1 3 44 27 2238 20 0 0 70 0 2403

CY 50 29 2 0 116 9 12 1 9 2694 4 9 50 2 2987

BC 0 16 13 409 680 20 7 0 0 18 1359 22 0 0 2544

SQ 0 2 0 2 44 30 75 5 0 36 15 1875 21 4 2109

SR 0 0 0 0 4 3 137 104 57 83 0 1 3182 0 3571

NW 566 7 35 3 8 0 0 0 1 2 4 3 1 3873 4503

The “Total” column is the total number of five-second intervals in which each activity was performed (data from all 39 participants).

Rows are actual activities performed, and columns are predicted activities.

LY = Lying, RE = Reading, CO = Computer, ST = Standing, LA = Laundry, SW = Sweeping, WS = Walk slow, WF = Walk fast, JO

= Jogging, CY = Cycling, BC = Biceps curls, SQ = Squats, SR = Stairs, NW = Non-wear.


Table 4.8. Confusion matrix for activity type classification for a GENEA accelerometer mounted on the left wrist.

GE Left Wrist   LY RE CO ST LA SW WS WF JO CY BC SQ SR NW Total

LY 2246 461 86 1 23 1 2 1 0 13 64 5 0 45 2948

RE 406 2228 381 31 100 36 3 0 0 80 6 29 3 29 3332

CO 199 374 2722 6 15 16 3 0 0 91 4 11 2 14 3457

ST 5 53 7 2291 38 33 15 1 0 9 22 3 0 64 2541

LA 91 83 4 25 2857 225 22 2 1 23 23 6 49 0 3411

SW 3 24 3 17 273 1952 36 22 0 16 5 2 97 5 2455

WS 7 5 1 18 56 31 2198 354 11 11 15 4 363 0 3074

WF 2 0 0 3 26 17 380 2019 1 1 18 9 412 2 2890

JO 0 0 0 2 11 2 12 10 2291 0 0 0 75 0 2403

CY 8 55 184 15 51 32 9 1 0 2606 10 2 13 1 2987

BC 89 18 7 29 77 10 10 51 0 14 2225 8 2 4 2544

SQ 4 53 39 1 32 4 6 10 0 51 11 1894 4 0 2109

SR 2 0 0 1 95 109 273 300 147 2 8 15 2619 0 3571

NW 85 5 70 326 7 0 1 0 0 0 8 0 1 4000 4503

The “Total” column is the total number of five-second intervals in which each activity was performed (data from all 39 participants).

Rows are actual activities performed, and columns are predicted activities.

LY = Lying, RE = Reading, CO = Computer, ST = Standing, LA = Laundry, SW = Sweeping, WS = Walk slow, WF = Walk fast, JO

= Jogging, CY = Cycling, BC = Biceps curls, SQ = Squats, SR = Stairs, NW = Non-wear.


Table 4.9. Confusion matrix for activity type classification for a GENEA accelerometer mounted on the right wrist.

GE Right Wrist   LY RE CO ST LA SW WS WF JO CY BC SQ SR NW Total

LY 2391 398 38 1 34 2 1 0 0 17 62 1 3 0 2948

RE 519 1995 406 63 98 42 4 0 0 102 7 17 1 78 3332

CO 101 210 2971 0 14 7 1 0 0 142 0 0 1 10 3457

ST 9 105 4 2287 23 25 25 2 0 7 11 4 2 37 2541

LA 87 78 2 16 2705 321 41 6 0 43 27 8 77 0 3411

SW 14 12 0 14 342 1902 54 0 2 27 11 6 71 0 2455

WS 3 4 0 11 67 30 2219 352 9 10 20 81 268 0 3074

WF 0 0 0 1 32 24 406 2006 0 1 14 15 391 0 2890

JO 0 0 0 3 3 0 5 18 2310 0 3 0 61 0 2403

CY 51 72 195 2 79 38 11 3 0 2513 7 1 14 1 2987

BC 26 1 6 36 61 17 29 20 0 10 2310 17 10 1 2544

SQ 1 20 1 2 24 18 57 7 1 66 13 1854 44 1 2109

SR 6 1 0 0 119 96 266 271 112 4 15 25 2656 0 3571

NW 114 88 13 168 4 2 1 0 0 1 1 0 0 4111 4503

The “Total” column is the total number of five-second intervals in which each activity was performed (data from all 39 participants).

Rows are actual activities performed, and columns are predicted activities.

LY = Lying, RE = Reading, CO = Computer, ST = Standing, LA = Laundry, SW = Sweeping, WS = Walk slow, WF = Walk fast, JO

= Jogging, CY = Cycling, BC = Biceps curls, SQ = Squats, SR = Stairs, NW = Non-wear.


Table 4.10. Activity-specific sensitivity, specificity, and AUC among the four accelerometer placement sites.

Sensitivity (% agreement)
        AG Hip        AG Thigh      GE Left Wrist   GE Right Wrist
LY      90.8 (3.3)*   53.9 (5.7)*   76.2 (4.9)*     81.1 (4.5)*
RE      36.5 (5.2)*   38.4 (5.3)*   66.9 (5.1)*     59.9 (5.3)*
CO      39.2 (5.2)*   54.6 (5.3)*   78.7 (4.3)*     85.9 (3.7)*
ST      56.9 (6.1)*   69.6 (5.7)*   90.2 (3.7)      90.0 (3.7)
LA      50.6 (5.3)*   59.3 (5.3)*   83.8 (3.9)*     79.3 (4.3)*
SW      60.0 (6.2)*   72.7 (5.6)*   79.5 (5.1)*     77.5 (5.3)*
WS      66.9 (5.3)*   82.2 (4.3)*   71.5 (5.1)      72.2 (5.0)
WF      77.1 (4.9)*   71.9 (5.2)*   69.9 (5.3)      69.4 (5.4)
JO      92.5 (3.4)    93.1 (3.2)    95.3 (2.7)^     96.1 (2.5)^
CY      66.0 (5.4)*   90.2 (3.4)*   87.2 (3.8)*     84.1 (4.2)*
BC      44.8 (6.2)*   53.4 (6.2)*   87.5 (4.1)*     90.8 (3.6)*
SQ      78.9 (5.5)*   88.9 (4.3)    89.8 (4.1)      87.9 (4.4)
SR      87.3 (3.5)*   89.1 (3.3)*   73.3 (4.6)      74.4 (4.6)
NW      81.8 (3.6)*   86.0 (3.2)*   88.8 (2.9)*     91.3 (2.6)*
Total   66.2 (1.4)*   71.4 (1.4)*   80.9 (1.2)      81.1 (1.2)

Specificity (%)
        AG Hip        AG Thigh      GE Left Wrist   GE Right Wrist
LY      99.5 (0.8)*   95.6 (2.4)*   97.7 (1.7)      97.6 (1.8)
RE      95.3 (2.3)    95.5 (2.2)    97.1 (1.8)^     97.5 (1.7)^
CO      95.3 (2.2)    95.6 (2.2)    98.0 (1.5)^     98.3 (1.4)^
ST      95.9 (2.5)*   98.8 (1.3)    98.8 (1.3)      99.2 (1.1)*
LA      95.7 (2.2)    95.8 (2.1)    97.9 (1.5)      97.7 (1.6)
SW      97.2 (2.1)*   98.3 (1.6)    98.7 (1.4)      98.4 (1.6)
WS      98.1 (1.5)    98.1 (1.5)    98.0 (1.6)      97.7 (1.7)
WF      98.9 (1.2)    98.8 (1.3)    98.1 (1.6)      98.3 (1.5)
JO      99.7 (0.7)    99.7 (0.7)    99.6 (0.8)      99.7 (0.7)
CY      96.5 (2.1)*   98.9 (1.2)    99.2 (1.0)      98.9 (1.2)
BC      96.3 (2.3)*   97.1 (2.1)    99.5 (0.9)      99.5 (0.9)
SQ      99.4 (1.1)    99.8 (0.6)    99.8 (0.6)      99.6 (0.9)
SR      97.6 (1.6)    98.7 (1.2)*   97.4 (1.7)      97.6 (1.6)
NW      98.1 (1.3)    98.4 (1.2)    99.6 (0.6)^     99.7 (0.5)^
Total   97.4 (0.5)*   97.8 (0.4)*   98.5 (0.4)      98.5 (0.4)

AUC
        AG Hip        AG Thigh      GE Left Wrist   GE Right Wrist
LY      0.95 (0.01)*  0.75 (0.01)*  0.87 (0.01)*    0.89 (0.01)*
RE      0.66 (0.01)*  0.67 (0.01)*  0.82 (0.01)*    0.79 (0.01)*
CO      0.67 (0.01)*  0.75 (0.01)*  0.88 (0.01)*    0.92 (0.01)*
ST      0.76 (0.01)*  0.84 (0.01)*  0.94 (0.01)*    0.95 (0.01)*
LA      0.73 (0.01)*  0.78 (0.01)*  0.91 (0.01)*    0.88 (0.01)*
SW      0.79 (0.01)*  0.86 (0.01)*  0.89 (0.01)*    0.88 (0.01)*
WS      0.82 (0.01)*  0.90 (0.01)*  0.85 (0.01)     0.85 (0.01)
WF      0.88 (0.01)*  0.85 (0.01)*  0.84 (0.01)     0.84 (0.01)
JO      0.96 (0.01)   0.96 (0.01)   0.97 (0.01)*    0.98 (0.01)*
CY      0.81 (0.01)*  0.95 (0.01)*  0.93 (0.01)*    0.92 (0.01)*
BC      0.71 (0.01)*  0.75 (0.01)*  0.93 (0.01)*    0.95 (0.01)*
SQ      0.89 (0.01)*  0.94 (0.01)   0.95 (0.01)*    0.94 (0.01)
SR      0.92 (0.01)*  0.94 (0.01)*  0.85 (0.01)*    0.86 (0.01)*
NW      0.90 (0.01)*  0.92 (0.01)*  0.94 (0.01)*    0.95 (0.01)*
Total   0.82 (0.00)*  0.85 (0.00)*  0.90 (0.00)     0.90 (0.00)

Values are shown as Mean (SD). The * indicates significant differences from all other accelerometer placements. The ^ indicates

significant differences from the hip and thigh accelerometers.

LY = Lying, RE = Reading, CO = Computer, ST = Standing, LA = Laundry, SW = Sweeping, WS = Walk slow, WF = Walk fast, JO

= Jogging, CY = Cycling, BC = Biceps curls, SQ = Squats, SR = Stairs, NW = Non-wear.

AG Hip = Hip-mounted ActiGraph monitor, AG Thigh = Thigh-mounted ActiGraph monitor, GE Left Wrist = GENEA monitor

placed on the left wrist, GE Right Wrist = GENEA monitor placed on the right wrist.


Table 4.11. Overall sensitivity, specificity, and AUC among the four accelerometer placement sites with combined activity categories.

Sensitivity (% agreement)
        AG Hip        AG Thigh      GE Left Wrist   GE Right Wrist
SE      72.6 (2.8)*   92.1 (1.7)*   93.5 (1.6)*     92.7 (1.6)*
ST      56.9 (6.1)*   69.6 (5.7)*   90.2 (3.7)      90.0 (3.7)
LI      68.5 (3.8)*   83.1 (3.1)*   90.5 (2.4)      89.8 (2.5)
WS      66.9 (5.3)*   82.2 (4.3)*   71.5 (5.1)      72.2 (5.0)
WF      77.1 (4.9)*   71.9 (5.2)*   69.9 (5.3)      69.4 (5.4)
JO      92.5 (3.4)    93.1 (3.2)    95.3 (2.7)^     96.1 (2.5)^
CY      66.0 (5.4)*   90.2 (3.4)*   87.2 (3.8)*     84.1 (4.2)*
EX      60.6 (4.5)*   70.3 (4.2)*   88.9 (2.9)*     90.1 (2.7)*
SR      87.3 (3.5)*   89.1 (3.3)*   73.3 (4.6)      74.4 (4.6)
NW      81.8 (3.6)*   86.0 (3.2)*   88.8 (2.9)*     91.3 (2.6)*
Total   72.5 (1.4)*   84.0 (1.1)*   86.6 (1.0)      86.7 (1.0)

Specificity (%)
        AG Hip        AG Thigh      GE Left Wrist   GE Right Wrist
SE      93.9 (1.5)*   97.0 (1.1)    97.2 (1.0)      97.2 (1.0)
ST      95.9 (2.5)*   98.8 (1.4)    98.8 (1.3)      99.2 (1.1)*
LI      94.7 (1.8)*   96.6 (1.5)*   97.7 (1.2)      97.6 (1.2)
WS      98.1 (1.6)    98.1 (1.5)    98.0 (1.6)      97.7 (1.7)
WF      98.9 (1.2)    98.8 (1.2)    98.1 (1.6)      98.3 (1.5)
JO      99.7 (0.7)    99.7 (0.7)    99.6 (0.8)      99.7 (0.7)
CY      96.5 (2.1)*   98.9 (1.2)    99.2 (1.0)      98.9 (1.2)
EX      94.7 (2.0)*   95.7 (1.8)*   97.3 (1.5)      97.4 (1.4)
SR      97.6 (1.6)    98.7 (1.2)*   97.4 (1.7)      97.6 (1.6)
NW      98.1 (1.3)    98.4 (1.1)    99.6 (0.6)^     99.7 (0.5)^
Total   96.8 (0.5)*   98.1 (0.4)*   98.3 (0.4)      98.3 (0.4)*

AUC
        AG Hip        AG Thigh      GE Left Wrist   GE Right Wrist
SE      0.83 (0.01)*  0.95 (0.01)   0.95 (0.01)     0.95 (0.01)
ST      0.76 (0.01)*  0.84 (0.01)*  0.94 (0.01)*    0.95 (0.01)*
LI      0.82 (0.01)*  0.90 (0.01)*  0.94 (0.01)     0.94 (0.01)
WS      0.82 (0.01)*  0.90 (0.01)*  0.85 (0.01)     0.85 (0.01)
WF      0.88 (0.01)*  0.85 (0.01)*  0.84 (0.01)     0.84 (0.01)
JO      0.96 (0.01)   0.96 (0.01)   0.97 (0.01)*    0.98 (0.01)*
CY      0.81 (0.01)*  0.95 (0.01)*  0.93 (0.01)*    0.92 (0.01)*
EX      0.78 (0.01)*  0.83 (0.01)*  0.93 (0.01)*    0.94 (0.01)*
SR      0.92 (0.01)*  0.94 (0.01)*  0.85 (0.01)*    0.86 (0.01)*
NW      0.90 (0.01)*  0.92 (0.01)*  0.94 (0.01)*    0.95 (0.01)*
Total   0.85 (0.00)*  0.91 (0.00)*  0.92 (0.00)     0.92 (0.00)

Values are shown as Mean (SD). The * indicates significant differences from all other accelerometer placement sites. The ^ indicates

significant differences from the hip and thigh accelerometers.

SE = Sedentary, ST = Standing, LI = Lifestyle, WS = Walk slow, WF = Walk fast, JO = Jogging, CY = Cycling, EX = Exercise, SR =

Stairs, NW = Non-wear.


AG Hip = Hip-mounted ActiGraph monitor, AG Thigh = Thigh-mounted ActiGraph monitor, GE Left Wrist = GENEA monitor

placed on the left wrist, GE Right Wrist = GENEA monitor placed on the right wrist.


Table 4.12. Activities classified into activity intensities by the Compendium and by measured METs.

Activity      Code   Description                        Compendium  Compendium  Experimentally  Experimental
                                                        METs        Intensity   measured METs   Intensity
                                                                                [Mean (SD)]
Lying         07011  Lying quietly, doing nothing,      1.3         Sedentary   1.4 (0.7)       Sedentary
                     lying in bed awake, listening to
                     music (not talking or reading)
Reading       09030  Sitting, reading, book,            1.3         Sedentary   1.4 (0.6)       Sedentary
                     newspaper, etc.
Computer      09040  Sitting, writing, desk work,       1.3         Sedentary   1.4 (0.6)       Sedentary
                     typing
Standing      07041  Standing, fidgeting                1.8         Light       1.4 (1.0)       Light
Laundry       05090  Laundry, fold or hang clothes,     2.0         Light       2.1 (0.6)       Light
                     put clothes in washer or dryer,
                     packing suitcase, washing
                     clothes by hand, implied
                     standing, light effort
Sweeping      05011  Cleaning, sweeping, slow, light    2.3         Light       2.5 (0.5)       Light
                     effort
Biceps curls  09071  Standing, miscellaneous            2.5         Light       2.0 (0.6)       Light
Walk slow     17152  Walking, 2.0 mph, level, slow      2.8         Light       2.9 (0.7)       Light
                     pace, firm surface
Walk fast     17200  Walking, 3.5 mph, level, brisk,    4.3         Moderate    4.2 (1.1)       Moderate
                     firm surface, walking for
                     exercise
Cycling       02017  Bicycling, stationary, 51-89       4.8         Moderate    4.4 (1.1)       Moderate
                     watts, light-to-moderate effort
Squats        02052  Resistance (weight) training,      5.0         Moderate    4.5 (1.2)       Moderate
                     squats, slow or explosive effort
Stairs        17130  Stair climbing, using or           8.0         Vigorous    6.8 (1.5)       Vigorous
                     climbing up ladder (Taylor
                     Code 030)
Jogging       12030  Running, 5 mph (12 min/mile)       8.3         Vigorous    8.0 (1.8)       Vigorous
Non-wear      --     --                                 --          --          --              --


Table 4.13. Overall sensitivity, specificity, and AUC among the four accelerometer placement sites for classification of activity

intensity.

Sensitivity (% agreement)
                    AG Hip        AG Thigh      GE Left Wrist   GE Right Wrist
Non-wear            81.8 (3.6)*   86.0 (3.2)*   88.8 (2.9)*     91.3 (2.6)*
Sedentary           72.6 (2.8)*   92.1 (1.7)*   93.5 (1.6)      92.7 (1.6)
Light-intensity     75.8 (2.3)*   93.4 (1.3)*   89.1 (1.6)      89.9 (1.6)
Moderate-intensity  75.4 (3.0)*   85.0 (2.5)*   82.6 (2.7)      81.0 (2.7)
Vigorous-intensity  92.3 (2.2)    92.9 (2.1)    85.9 (2.8)^     86.0 (2.8)^
MVPA                87.3 (1.8)*   93.0 (1.3)*   89.4 (1.6)      88.6 (1.0)
Total               78.0 (1.3)*   90.7 (0.9)*   88.4 (1.0)      88.5 (1.0)

Specificity (%)
                    AG Hip        AG Thigh      GE Left Wrist   GE Right Wrist
Non-wear            98.1 (1.3)    98.4 (1.1)    99.6 (0.6)^     99.7 (0.5)^
Sedentary           93.9 (1.5)*   97.0 (1.1)    97.2 (1.0)      97.2 (1.0)
Light-intensity     86.5 (1.8)*   96.3 (1.0)*   93.7 (1.3)      93.8 (1.3)
Moderate-intensity  94.4 (1.6)*   97.6 (1.1)*   96.8 (1.2)      96.5 (1.3)
Vigorous-intensity  97.6 (1.2)    98.6 (0.9)*   97.4 (1.3)      97.5 (1.3)
MVPA                92.4 (1.4)*   97.6 (0.8)*   95.5 (1.1)      95.3 (1.1)
Total               92.5 (0.8)*   97.2 (0.5)*   96.2 (0.6)      96.2 (0.6)

AUC
                    AG Hip        AG Thigh      GE Left Wrist   GE Right Wrist
Non-wear            0.90 (0.01)*  0.92 (0.01)*  0.94 (0.01)*    0.95 (0.01)*
Sedentary           0.83 (0.01)*  0.95 (0.01)   0.95 (0.01)     0.95 (0.01)
Light-intensity     0.81 (0.00)*  0.95 (0.01)*  0.91 (0.01)*    0.92 (0.01)*
Moderate-intensity  0.85 (0.01)*  0.91 (0.01)*  0.90 (0.01)*    0.89 (0.01)*
Vigorous-intensity  0.95 (0.01)*  0.96 (0.01)*  0.92 (0.01)     0.92 (0.01)
MVPA                0.90 (0.00)*  0.95 (0.01)*  0.92 (0.01)     0.92 (0.01)
Total               0.85 (0.00)*  0.94 (0.00)*  0.92 (0.00)     0.92 (0.00)

Values are shown as Mean (SD).

The * indicates significant differences from all other accelerometer placement sites.

The ^ indicates significant differences from the hip and thigh accelerometers.

AG Hip = Hip-mounted ActiGraph monitor, AG Thigh = Thigh-mounted ActiGraph monitor, GE Left Wrist = GENEA monitor

placed on the left wrist, GE Right Wrist = GENEA monitor placed on the right wrist.


Figure 4.3. Comparison of dominant and non-dominant wrist accelerometer sensitivities.

[Figure 4.3 plot. Y-axis: Sensitivity (%), 0-100; x-axis: Feature Set (1-5); series: Non-Dominant, Dominant.]


DISCUSSION

The purpose of this study was to develop and validate ANNs using data from

accelerometers placed at several locations on the body in order to classify activity types.

Specifically, we compared the accuracy of ANNs developed for wrist-, hip-, and thigh-mounted

accelerometers and compared the accuracy of accelerometers placed on the left and right

wrists. A secondary purpose was to assess the accuracies of the four accelerometer placement

sites for classifying specific types of activities (e.g., sedentary, lifestyle, and exercise activities)

and activity intensities and to test multiple feature sets.

The wrist-mounted accelerometers outperformed the hip- and thigh-mounted

accelerometers for total classification accuracy, achieving over 80% sensitivity in our initial

analysis and over 86% when combining similar activities into subcategories (i.e., combining

lying, reading, and computer use as sedentary). Additionally, when looking solely at the three

sedentary activities, the wrist monitors provided sensitivities of 92.7-93.5% when combined into

a single sedentary category, which was slightly higher than the thigh (92.1%) and much higher

than the hip (72.6%). The wrist accelerometer placement sites also had the highest sensitivities

for detecting exercise and lifestyle activities, although the thigh had the highest sensitivity for

classifying cycling. Also, the wrist monitors provided higher sensitivity for standing than the hip

or thigh. In direct comparison of the left and right wrist placements, we found no differences in

overall sensitivity and only very slight differences (1-5%) for specific activity types. These

small differences were statistically significant due to the large number of windows of data

(>42,000) used when determining sensitivity, but the clinical or real-world significance of the

differences between the left and right wrist accelerometer placements is likely minimal.


Furthermore, follow-up analyses comparing dominant vs. non-dominant wrists yielded no

differences in overall classification accuracy. These findings provide strong evidence that wrist

accelerometers can be used to achieve high accuracy for recognition of a variety of sedentary,

ambulatory, lifestyle, and exercise activities.

The superiority of the wrist accelerometer placements for the exercise and lifestyle

activities was expected because these activities (with the exception of squats) utilize mostly

upper-body movements. These activities would be easier to detect with monitors worn on the

wrists compared to accelerometers worn on the hip or thigh since the patterns of wrist movement

are likely more distinct than thigh or hip movement and, therefore, would be best recognized

using pattern recognition approaches such as ANNs. The superior accuracy of the wrist

accelerometer placement sites for measurement of specific sedentary activities was initially

surprising given that thigh-mounted accelerometers have consistently yielded high accuracy for

measurement of time spent in sedentary activities as well as breaks in sedentary activities

(Kozey-Keadle, Libertine et al. 2011; Lyden, Kozey Keadle et al. 2012). However, a recent

study by Rowlands et al. described an elegant and accurate way to use a concept called the

“sedentary sphere” to identify specific types of sedentary activities from a wrist accelerometer

(Rowlands, Olds et al. 2014). Our study provides further evidence that a wrist-worn

accelerometer can provide an accurate indication of specific types of sedentary activities.

The high overall sensitivities achieved with the wrist-mounted accelerometers were also

surprising given previous research showing that wrist-mounted accelerometers are often

outperformed by monitors on other parts of the body. A study by Mannini et al. (Mannini, Intille

et al. 2013) showed higher classification accuracies of an ankle monitor (95%) compared to a


wrist monitor (84.7%), although the overall accuracy of the wrist monitor is very similar to that

achieved in our study. Furthermore, Skotte et al. found classification accuracies of 99% for

classifying activity type using hip- and thigh-mounted accelerometers (Skotte, Korshoj et al.

2014), which is well above the classification accuracies achieved in our study. However, Skotte

et al. tested only six activities, and the authors ended up removing one activity (stair climbing)

since it had poor classification accuracy. Results from Cleland et al. (Cleland, Kikhia et al.

2013) showed very high classification accuracies for hip, wrist, and thigh accelerometers (95-

97%), but again, only seven activities were used. In short, most research comparing several

different accelerometer placement sites is limited by use of small subject numbers (i.e., < 20)

and/or small numbers of activities in their studies, limiting their comparisons of the advantages

and disadvantages of each placement site.

In a recent study, members of our research team found higher classification accuracies of

a thigh-mounted accelerometer compared to a wrist-mounted accelerometer (78% vs. 71%) for

classifying 14 activities in a laboratory-based setting (Dong, Montoye et al. 2013). However, the

accelerometers used in this study measured acceleration data in only two axes (Dong, Montoye

et al. 2013), whereas accelerometers used in the current study measured accelerations in three

planes of movement (triaxial). It is reasonable to assume that the hip and thigh, which lie closest

to the center of the body, would move mostly in the anterior-posterior and vertical planes;

conversely, the wrist, which was the most distal accelerometer attachment site tested, would

experience significant movement in the medial-lateral plane as well as the anterior-posterior and

vertical planes. Therefore, addition of a third measurement axis in the current study may have

benefitted the ANNs for the wrist accelerometers much more than the hip or thigh

accelerometers and contributed to the much higher accuracy for the ANNs developed for the

wrist-mounted accelerometers seen in this study compared to Dong’s work. Also, Dong’s study,

as well as Cleland’s and Skotte’s, used a laboratory-based setting, which would have

questionable generalizability to a free-living environment (Gyllensten and Bonomi 2011; Trost,

Wong et al. 2012; Lyden, Keadle et al. 2013; Mannini, Intille et al. 2013). Our current study

builds on this previous research by validating activity type recognition of ANNs developed

for wrist-, hip-, and thigh-mounted accelerometers in a simulated free-living setting, with a wide

range of activities and the ability to directly compare monitors located at these popular

placement sites.

The utility of the wrist placement sites for activity type prediction found in this study is

especially encouraging given its implementation in many studies, including the 2011-2014

NHANES data collection cycle. There is preliminary evidence that participant compliance is

improved with wear on the wrist (Troiano, McClain et al. 2014), so the wrist holds promise for

use in large studies due in part to this improved compliance but also its high accuracy of

measurement for both activity type and energy expenditure prediction seen in this study as well

as previous work. Additionally, we have found that choice of wrist (left vs. right and dominant

vs. non-dominant) does not lower accuracy of activity type prediction, which is encouraging as it

may allow participants in large studies to choose the wrist on which they wear the accelerometer.

The use of wrist-mounted monitors for sleep measurement in previous research (Kripke,

Mullaney et al. 1978; Mullaney, Kripke et al. 1980; Jean-Louis, Kripke et al. 2001) suggests that

there is potential for a single accelerometer placed on the wrist to measure physical activity,

inactivity/sedentary behavior, and sleep accurately, thereby providing a comprehensive

measurement tool for assessing several different behavioral characteristics known to have strong

associations with health. Investigators have begun to use commercially available devices such as

the Fitbit (Fitbit Inc., San Francisco, CA) and Nike Fuelband (Nike Inc., Beaverton, OR) for

comprehensive measurement of activity and sleep, but the limited evidence available does not

support their accuracy (Montgomery-Downs, Insana et al. 2012; Dannecker, Sazonova et al.

2013; Fortune, Lugade et al. 2014), and we know of no research-grade devices yet capable of

accomplishing this task. Now that we have developed algorithms to classify activity type and

predict energy expenditure from a wrist-worn accelerometer, we intend to expand our

investigations and measure sleep duration and quality as well as sedentary behavior.

While the wrist accelerometer placements performed the best in this study, the

performance of the hip and thigh accelerometer placements should not be overlooked. At over

70% for prediction accuracy when combining similar activities into categories, the hip placement

performed well, although this accuracy is not the highest achieved in the literature. As

previously discussed, Cleland et al. (Cleland, Kikhia et al. 2013) and Skotte et al. (Skotte,

Korshoj et al. 2014) both achieved over 97% accuracy for the hip-mounted accelerometer for

classifying activity type. In the current study, the main weaknesses of the hip placement were

encountered in classifying sedentary behaviors, standing, and lifestyle and exercise activities. Of

these, only sitting and standing were included in the studies by Cleland and Skotte, likely

contributing to their higher accuracy of measurement. Members of our research group (Dong,

Montoye et al. 2013) used a similar activity set and achieved higher accuracy (78%) for the hip

than in the current study, which may be partially attributed to the use of a simulated free-living

setting in the current study (vs. a laboratory-based setting in the previous study). Additionally,

our research group controlled the exact speed of the walking (2.0 and 4.0 miles/hour), jogging

(6.0 miles/hour), and stair climbing (60 steps/min) in the previous study, whereas the current

study included no such limitations on speeds, resulting in much more variability in speed of

movement and potentially lower accuracy for classifying these tasks. The discrepancy in

accuracies between the lab-based and free-living settings also indirectly shows the importance of

validating predictive algorithms in a setting similar to that in which they are intended for use,

thereby obtaining a more realistic view of their accuracy in the true free-living setting. In terms

of accurately classifying sedentary activities, the hip performed moderately well, with 72.6%

accuracy for the combined category but only 36.5-90.8% for the individual sedentary activities.

According to Table 4.10, sedentary activities were often misclassified as standing with the hip

accelerometer placement, which is not surprising given the static nature of these activities as well

as the similar hip angle seen with sitting and standing. Poor classification of sedentary behavior

by hip-worn accelerometers was also seen in studies by Lyden et al. (Lyden, Kozey Keadle et al.

2012) and Kozey-Keadle et al. (Kozey-Keadle, Libertine et al. 2011), where the hip-worn

accelerometer inaccurately predicted total sedentary time and breaks in sedentary time using the

cut-point approach to classification. Despite fairly widespread use of hip-mounted

accelerometers for measuring sedentary behavior in previous literature, our findings, along with

those of Lyden and Kozey-Keadle, suggest that hip-mounted accelerometer estimates of

sedentary behavior should be used with caution, regardless of whether cut-points or machine

learning are used for prediction.

The current study showed that a thigh-mounted accelerometer performed better than the

hip placement but worse than the wrist placements for activity type prediction. At 71.4%, the

thigh placement had lower accuracy than achieved in Cleland’s and Skotte’s work (>95%) and

slightly lower accuracy than in our previous, lab-based study (78%). Our lower measurement

accuracy is likely due to the addition of more activities than the other studies and use of a

simulated free-living setting. Notably, accuracy of the thigh accelerometer increased from

71.4% to 84.0% upon combining the categories for sedentary, lifestyle, and exercise activities.

This increase was due mostly to improved measurement accuracy for sedentary and lifestyle

activities upon combining them into categories. The inability to measure individual

sedentary activities accurately was expected since the angle and movement of the thigh is very

similar for lying and seated activities. However, of greater importance is that the thigh

accelerometer achieved high accuracy differentiating sedentary activities from non-sedentary

activities and was able to differentiate between sedentary activities and standing (as seen in

Table 4.8), which the hip-mounted accelerometer was unable to accomplish. The differentiation

between sedentary activities and standing is important for measuring total sedentary time as well

as breaks in sedentary behavior, and the high accuracy we found for the thigh is in accordance

with previous studies showing excellent accuracy of thigh-mounted accelerometers for

measuring sedentary time and breaks in sedentary behavior (Grant, Ryan et al. 2006; Lyden,

Kozey Keadle et al. 2012). The high overall accuracy of thigh-mounted accelerometers for

classifying sedentary and non-sedentary activities achieved in this study provides further

rationale for their use in measuring PA as well as SB in free-living settings.
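
The effect of combining activities into categories can be illustrated with a minimal sketch. The confusion matrix below is hypothetical (it is not data from this study), and the function names are illustrative assumptions. The sketch shows only why confusions among similar activities (e.g., lying vs. reading) stop counting as errors once those activities share a category.

```python
def overall_accuracy(confusion):
    """Fraction of windows on the diagonal, i.e., correctly classified."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def merge_classes(confusion, groups):
    """Collapse a confusion matrix into broader categories.

    groups is a list of lists of original class indices, one list per
    merged category (rows = observed, columns = predicted).
    """
    return [
        [sum(confusion[r][c] for r in rows for c in cols) for cols in groups]
        for rows in groups
    ]


# Hypothetical window counts for illustration only; NOT study data.
# Class order: lying, reading, computer use, standing.
confusion = [
    [60, 25, 15, 0],
    [20, 50, 28, 2],
    [10, 30, 58, 2],
    [0, 3, 2, 95],
]

# Treat the three sedentary activities as one category; keep standing separate.
merged = merge_classes(confusion, [[0, 1, 2], [3]])

print(f"individual activities: {overall_accuracy(confusion):.4f}")
print(f"combined categories:   {overall_accuracy(merged):.4f}")
```

In this toy example, nearly all errors are confusions among the three sedentary activities, so accuracy rises sharply once they are merged, while the few sedentary vs. standing errors remain.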

Grouping of activities by intensity resulted in highest sensitivity and AUC by the thigh

accelerometer placement and slightly lower values for the wrist placements (Table 4.13). Given

that the thigh accelerometer placement performed best for estimation of energy expenditure

(Chapter 3), the higher performance of the thigh placement for prediction of activity intensity

provides further evidence that the thigh accelerometer placement shows higher measurement

accuracy than the hip or wrist placements for prediction of the energy cost of activities. In

studies specifically focused on identifying time spent in different activity intensities (i.e., for

identifying time spent sedentary or time spent in MVPA), the thigh accelerometer placement

may be optimal. However, the wrist placements appear best if trying to identify individual types

of activities.

Due to the different strengths of the monitors located on the hip, thigh, and wrists, choice

of placement should depend on the population of interest as well as the specific research

question. ANNs for hip-mounted accelerometers classified ambulatory activities and stair use

well in this study and have previously been shown to provide highly accurate estimates of energy

expenditure (Staudenmayer, Pober et al. 2009) (Chapter 3), but studies in pregnant or obese

populations should consider avoiding use of a hip accelerometer due to monitor tilt that can

occur (Feito, Bassett et al. 2011). For researchers interested specifically in measuring sedentary

behavior, the thigh-mounted accelerometer may be preferable due to high accuracy of

classification seen in this study as well as in previous work by Lyden et al. and Kozey Keadle et

al. (Kozey-Keadle, Libertine et al. 2011; Lyden, Kozey Keadle et al. 2012). In contrast, those

seeking to maximize compliance or those interested in recognition of activity types or sleep may

benefit from use of a wrist-mounted accelerometer (Jean-Louis, Kripke et al. 2001).

Strengths and limitations

This study had several limitations that must be addressed. First, our sample was

relatively homogeneous and consisted mainly of younger adults who had a lower BMI than the

general population. Individuals larger or smaller than those tested, or individuals who perform

activities at a different intensity than performed in the study, may not be measured well using the

current ANN algorithms. Additionally, our study provided only a sample of activities that

people may perform on a daily basis, and therefore our models cannot necessarily be used for

comprehensive assessment of everyday activities. Finally, our study did not record the walking

or jogging speeds performed by participants, which may have been useful for evaluating the

differences between these activities. However, MET values recorded for the slow walk averaged

2.9 METs, while the fast walk elicited an average MET value of 4.2 METs, providing evidence

that the two walking speeds were distinct activities that fell into different intensity categories

(i.e., light vs. moderate). This study also had several notable strengths. First, our models were

created and tested using more than 42,000 five-second windows of data from 39 participants,

which is larger than many data sets used in previous studies. Second, validation studies cannot

reasonably test all activities that a person could perform, so it is important to pick a set of

activities that encompasses a range of intensities and types as well as activities commonly

performed in daily life. Our study incorporated a diverse collection of activities, including

commonly performed activities such as walking and several sedentary activities, as well as

lifestyle and exercise activities of varying intensities. Additionally, our use of a simulated

free-living setting is a major advantage as it allowed for much greater variability in the movement

patterns and intensities of the activities performed as well as not requiring steady-state to be

achieved, as is usually the case for laboratory-based protocols. Our inclusion of wrist-, hip-, and

thigh-mounted accelerometers is also a study strength as it allowed for direct comparison of the

accuracy of models developed for each placement site. Finally, our use of Microsoft Excel for

data processing, cleaning, and analysis and R statistical software for model creation and testing

provides further evidence of the accessibility of machine learning to those without access to

highly powered statistical software or computer programming experience. Staudenmayer et al.

(Staudenmayer, Pober et al. 2009) and Lyden et al. (Lyden, Keadle et al. 2013) provide simple

details on the code used for developing and testing ANNs using R software.
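
The light vs. moderate distinction drawn above for the two walking speeds can be sketched as follows. The sketch assumes the conventional thresholds of 3 and 6 METs for moderate and vigorous intensity; the exact cut-offs applied in this study are not restated here, and the function name is illustrative. Sedentary behavior is deliberately omitted because it depends on posture as well as energy cost.

```python
def intensity_category(mets):
    """Map a MET value for a non-sedentary activity to an intensity category.

    Assumes the conventional 3-MET and 6-MET thresholds (an assumption,
    not a value taken from this study's methods).
    """
    if mets < 3.0:
        return "light"
    if mets < 6.0:
        return "moderate"
    return "vigorous"


# Mean MET values reported above for the slow and fast walks.
for activity, mets in [("slow walk", 2.9), ("fast walk", 4.2)]:
    print(activity, mets, intensity_category(mets))
```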

Conclusions

In conclusion, we tested four accelerometers located on the left and right wrists, right hip,

and right thigh for their utility in classifying activity type across a wide range of activities

performed in a simulated free-living setting. Overall sensitivity was moderately high at 66-81%,

which improved to 73-87% when condensing similar activities into categories. Both wrist

accelerometer placement sites outperformed the hip and thigh placements for total classification

accuracy as well as in many of the individual activities, providing further support of the wrist

placement for use in large epidemiologic and surveillance studies. Our study builds upon

previous work by using a simulated free-living setting, which enhances generalizability of the

findings as well as the predictive models created. In the future, we intend to expand our

algorithms to measure sleep quality and duration and validate the algorithms in a larger, more

diverse sample.

CHAPTER 5

VALIDATION AND COMPARISON OF ACCELEROMETERS WORN ON THE

WRISTS, HIP, AND THIGH FOR MEASURING SEDENTARY BEHAVIOR

ABSTRACT

The purpose of this study was to validate and compare the accuracy of activity type prediction

models developed for accelerometers placed on the wrists, hip, and thigh for measurement of

total time spent in sedentary behavior and breaks in sedentary behavior. METHODS: Forty-four

healthy adults participated in a 90-minute simulated free-living activity protocol, in which

participants performed a total of 14 sedentary, ambulatory, lifestyle, and exercise activities for 3-

10 minutes each. Participants dictated the order, duration, and intensity of activities, which were

recorded using direct observation (for a criterion measure of total time spent in sedentary

behavior and breaks in sedentary behavior). All time spent in lying, reading, and computer use

was summed to obtain a measure of total time spent in sedentary behavior. Any transition from

one of these three activities to a non-sedentary activity was recorded to measure breaks in

sedentary behavior. Four accelerometers were worn (right and left wrists, right hip, and right

thigh) in order to predict total time spent in sedentary behavior and breaks in sedentary behavior

compared to that measured by direct observation (using the activity type prediction models

developed in our previous research [Chapter 4]). We tested three break intervals (5,

30, and 60 seconds) in order to determine the best method of characterizing breaks in sedentary

behavior from an accelerometer. Differences among accelerometer-predicted and criterion-

measured total time spent in sedentary behaviors and breaks in sedentary behavior were

evaluated using repeated measures analysis of variance and by non-overlap of 95% confidence

intervals. RESULTS: For total time spent in sedentary behavior, all four accelerometers

provided similar estimates to direct observation, but the wrist accelerometers had the lowest error

for prediction (2.8-3.1 minutes), and the hip had the highest error (7.2 minutes). For breaks in

sedentary behavior, the 30-second break interval provided the greatest predictive accuracy.

Using this interval, the hip and left wrist accelerometers produced estimates similar to those

measured by direct observation, but the thigh and right wrist underestimated breaks in sedentary

behavior by 15-17%. CONCLUSIONS: Hip and left wrist accelerometer placements provided

the highest overall accuracy for measuring the multiple constructs of sedentary behavior. These

findings lie in contrast to previous research showing the utility of thigh accelerometers for

measurement of sedentary behavior and therefore warrant confirmation. The superiority of the

left wrist accelerometer over the right wrist accelerometer provides support for the convention

that accelerometers be placed on the non-dominant wrist for sedentary behavior measurement.

INTRODUCTION

Physical activity (PA) has long been recognized for its beneficial effects on many health

indices, such as lowering risk of obesity, cardiovascular disease, and certain cancers, just to

name a few (Morris, Clayton et al. 1990; King and Tribble 1991; Thune and Furberg 2001).

Correspondingly, the Physical Activity Guidelines Advisory Committee issued a report in 2008

detailing evidence-based recommendations that adults should attain 150 minutes/week of

moderate-intensity PA or 75 minutes of vigorous-intensity PA to experience health benefits

(2008). Sedentary behavior (SB) has traditionally been viewed as a lack of PA, and people were

considered sedentary if not meeting the national PA recommendations (Pate, O'Neill et al. 2008).

However, it is possible to meet PA recommendations and still spend substantial time engaged in

sedentary activities (e.g., driving, using a computer, watching TV), a group Owen et al. called

the “active couch potatoes” (Owen, Healy et al. 2010). More recently, epidemiologic and

laboratory-based studies have started uncovering associations between high amounts of SB and

diminished metabolic, cardiovascular, and bone health as well as an increased risk of obesity,

some cancers, and all-cause mortality (Zerwekh, Ruml et al. 1998; Hu, Li et al. 2003; Hamilton,

Hamilton et al. 2004; Hamilton, Hamilton et al. 2007; Howard, Freedman et al. 2008; Schrage

2008; Katzmarzyk, Church et al. 2009; Owen, Healy et al. 2010). Notably, these associations are

largely independent of level of PA. Additionally, it appears that the way SB is accrued may

influence its effects on health, with longer periods of SB being worse than SB broken up

periodically by short, non-sedentary activities (Healy, Dunstan et al. 2008; Owen, Healy et al.

2010).

Despite emerging findings of the potential health risks of SB, there is currently

insufficient research to allow for evidence-based recommendations to be created with regard to

SB. Given that adults spend well over 50% of their waking hours in SB (Matthews, Chen et al.

2008), it is important to accurately measure SB in order to better determine health risks

associated with SB and develop evidence-based recommendations for SB in order to improve

health.

Accelerometer-based activity monitors have become a widely used and accepted method

for PA and energy expenditure measurement due to their objectivity, relatively low participant

and researcher burden, and high measurement accuracy in numerous validation studies

conducted in laboratory-based and free-living environments (Welk 2002). Traditionally,

accelerations of the body were recorded and translated into ‘activity counts,’ which correspond to

magnitude of acceleration. Activity counts could then be placed into simple linear regression

equations to estimate energy expenditure and activity intensity (Montoye, Washburn et al. 1983;

Freedson, Melanson et al. 1998). A count cut-point of <100 counts/minute has been widely used

as a threshold for estimating SB using accelerometers; however, this cut-point has been shown to

provide inaccurate estimates of SB and an inability to measure breaks in SB in free-living settings

(Kozey-Keadle, Libertine et al. 2011; Lyden, Kozey Keadle et al. 2012). Other cut-points ranging

from 50-250 counts/minute have been used to define SB with varying degrees of accuracy (Kozey-

Keadle, Libertine et al. 2011; Lyden, Kozey Keadle et al. 2012). Regardless of which cut-point is

chosen to designate SB, the cut-point method has several notable shortcomings. First, the cut-point

method does not allow for differentiation of SB from accelerometer non-wear without establishing

additional data reduction rules (e.g., how many consecutive minutes of 0 counts should count as

non-wear), which can affect estimates of SB and PA (Masse, Fuemmeler et al. 2005). Moreover,

the cut-point approach would likely classify standing as sedentary (since little movement occurs

when standing), but several studies have provided evidence that standing elicits a different

physiologic response than sitting or lying (Bey and Hamilton 2003). Additionally, standing has

been shown to be inversely associated with all-cause mortality and cardiovascular disease,

especially in individuals not meeting PA recommendations (Katzmarzyk 2014). Therefore, an

accurate measurement tool for SB needs to be able to differentiate between non-wear and SB as

well as standing and sitting/lying.
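
The two shortcomings of the cut-point approach described above can be seen in a minimal sketch. The <100 counts/minute threshold is the one cited in the text; the minute-by-minute counts and the function name are hypothetical and chosen only for illustration.

```python
SEDENTARY_CUT_POINT = 100  # counts/minute, the widely used threshold cited above


def classify_minutes(counts_per_minute, cut_point=SEDENTARY_CUT_POINT):
    """Label each minute as sedentary when its count falls below the cut-point."""
    return [
        "sedentary" if counts < cut_point else "non-sedentary"
        for counts in counts_per_minute
    ]


# Hypothetical data for illustration only; NOT study data.
observed = ["sitting", "sitting", "standing", "standing", "walking", "not worn"]
counts = [12, 40, 35, 60, 1850, 0]

predicted = classify_minutes(counts)
for behavior, count, label in zip(observed, counts, predicted):
    print(f"{behavior:9s} {count:5d} counts/min -> {label}")
```

Because the rule considers only the magnitude of movement, the standing minutes and the non-wear minute in this example are labeled sedentary alongside the sitting minutes.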

Due to limitations of the cut-point approach to measuring SB as well as energy

expenditure, researchers have turned to more advanced data processing techniques, such as

machine learning models, to improve accuracy of activity measurement. These studies show

dramatically improved measurement of energy expenditure (Staudenmayer, Pober et al. 2009;

Freedson, Lyden et al. 2011; Lyden, Keadle et al. 2013) and highly accurate classification of

activity type from a hip-mounted accelerometer (Pober, Staudenmayer et al. 2006; Staudenmayer,

Pober et al. 2009; Freedson, Lyden et al. 2011). However, to our knowledge, only one study has

used machine learning models developed for a hip-mounted accelerometer specifically to measure

total time in SB and breaks in SB. In this study, Lyden et al. found that total time spent in SB and

breaks in SB could be measured accurately in a free-living setting, but only when the machine

learning model was also developed in the free-living setting in which it was subsequently used.

Therefore, there is encouraging, but by no means conclusive, evidence that machine learning

models can improve measurement of SB using hip-mounted accelerometers.

Despite the common use of hip-mounted accelerometers, there are advantages of wearing

accelerometers on other parts of the body. For example, tilt angle of a hip-mounted accelerometer

will affect its measurement accuracy, which can pose problems when trying to measure pregnant or

overweight individuals (Feito, Bassett et al. 2011; DiNallo, Downs et al. 2012). Additionally, the

introduction of machine learning modeling to accelerometer data has dramatically improved

measurement accuracy of accelerometers worn in various body locations, such as the wrist, thigh,

ankle, lower back, and upper arm (Preece, Goulermas et al. 2009). Studies aiming to classify

activity type using accelerometers placed on the wrist and thigh have consistently shown

accuracies of >70% and often accuracies above 90% in laboratory-based studies (Zhang, Rowlands

et al. 2012; Cleland, Kikhia et al. 2013; Mannini, Intille et al. 2013; Skotte, Korshoj et al. 2014).

These two measurement sites are appealing not only for their high activity classification accuracy

but also for their utility in measuring lifestyle behaviors such as sleep quality (wrist) (Webster,

Kripke et al. 1982; Jean-Louis, Kripke et al. 2001) and SB (thigh) (Kozey-Keadle, Libertine et al.

2011; Lyden, Kozey Keadle et al. 2012; Skotte, Korshoj et al. 2012; Skotte, Korshoj et al. 2014),

as well as their potential to improve participant compliance. Thigh-mounted accelerometers have

been used with high accuracy for measuring total time spent in SB as well as breaks in SB and are

often used as a criterion measure of SB in free-living environments. However, methods developed

to classify SB from a thigh accelerometer provide accurate estimates of step count (Maddocks,

Petrou et al. 2010; Harrington, Welk et al. 2011) but do not allow for detailed information on PA

behaviors and appear to underestimate energy expenditure (Harrington, Welk et al. 2011). It

would be useful to have a single measurement tool that could measure a variety of activity types as

well as SB in a free-living setting; to our knowledge, no such method has yet been validated.

Additionally, the wrist-mounted accelerometer has not yet been validated for measurement of total

time spent in SB or breaks in SB.

We have previously developed and validated machine learning algorithms for hip-, wrist-,

and thigh-mounted accelerometers that can classify activity type with accuracies above 70% for the

hip and above 80% for the wrists and thigh, but these have yet to be validated for measurement of

SB (Chapter 4). Therefore, the primary purpose of our study was to develop, validate, and

compare the accuracy of machine learning algorithms created from hip-, wrist-, and thigh-mounted

accelerometers for measuring 1) total time spent in SB and 2) breaks in SB in a simulated free-

living environment. A secondary purpose was to compare accelerometers located on the left and

right wrists for prediction of total time spent in SB as well as breaks in SB.

METHODS

Summary of protocol

Participants came to the Human Energy Research Laboratory to participate in a 90-

minute simulated free-living protocol, for which they performed a total of 14 sedentary,

ambulatory, lifestyle, and exercise activities while wearing a total of four accelerometers

(placed on the right hip, right thigh, and both wrists). Each activity was performed for

3-10 minutes, with the order, duration, and intensity of activities left up to participants. During

the protocol, the order and duration of participants’ activities as well as total time spent in SB

and breaks in SB were recorded by a trained observer.

Participants

A total of 44 adults (22 male, 22 female) were recruited from the surrounding area of

East Lansing, MI via email, flyers, and word of mouth for participation in this study. In order

to be eligible for participation, participants had to fulfill three criteria: 1) they had to be free of

health conditions preventing them from being able to safely perform moderate- or vigorous-

intensity physical activities, 2) they could not have orthopedic limitations that would

invalidate the use of accelerometry, and 3) they had to fall within the age range of 18-44 years.

Prior to participant recruitment, this study was approved by the Michigan State University

Institutional Review Board.

Instrumentation

Each participant wore four accelerometer-based activity monitors in this study: two

ActiGraph GT3X+ accelerometers and two GENEActiv accelerometers. Additionally, a portable

digital assistant (PDA) computer was used by observers to record the activities performed during

the protocol. The accelerometers and PDA were synchronized to an external clock before each

test; descriptions of the accelerometers and PDA follow.

The acceleration data for all four accelerometers were time stamped and stored within the

monitors until they could be downloaded to a computer for analysis. Additionally, the

accelerometers were oriented so that the x-axis was the vertical axis, the y-axis was the medial-

lateral axis, and the z-axis was the anterior-posterior axis.

ActiGraph accelerometers

The ActiGraph (ActiGraph LLC, Pensacola, FL) is a commonly used accelerometer for

activity measurement, and there is an abundance of literature regarding its reliability and validity

for measurement of PA (Freedson, Melanson et al. 1998; McClain, Sisson et al. 2007). Two

GT3X+ models were worn by each participant during the study, one on the midline of the right

thigh and adhered to the leg (with hypoallergenic sticky tape), and the other placed on the right hip

at the anterior axillary line (with an elastic belt). The ActiGraph GT3X+ records raw

accelerations of up to ± 6 times the gravitational force (6g) in three dimensions of movement. For

the current protocol, the accelerometers recorded at a rate of 40 samples per second (40 Hz).

GENEA accelerometers

The GENEActiv accelerometer (Activinsights Ltd, Kimbolton, Cambridgeshire, UK) has

undergone preliminary validations for PA measurement (Esliger, Rowlands et al. 2011) as well

as activity type classification (Zhang, Rowlands et al. 2012). The GENEA records raw

accelerations of up to ± 6g in three axes of movement, and the GENEA monitors used in this study

were set to record acceleration data at a rate of 20 Hz. Participants wore two GENEA

accelerometers which were fastened to the dorsal side of each wrist using a watch strap supplied by

the manufacturers (Esliger, Rowlands et al. 2011).
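
Because the two monitor types recorded at different rates (40 Hz and 20 Hz), a fixed-duration analysis window contains a different number of samples for each. The following is a minimal sketch of such windowing, assuming the five-second windows described for the Chapter 4 models. The function names, the synthetic signal, and the two summary features (mean and standard deviation) are illustrative assumptions and are not the feature set used in this study.

```python
from statistics import mean, pstdev


def split_into_windows(samples, sample_rate_hz, window_s=5):
    """Split one axis of raw acceleration into non-overlapping windows.

    A trailing partial window is discarded.
    """
    size = int(sample_rate_hz * window_s)
    n_full = len(samples) // size
    return [samples[i * size:(i + 1) * size] for i in range(n_full)]


def window_features(window):
    """Two simple summary features per window (illustrative assumption)."""
    return {"mean": mean(window), "sd": pstdev(window)}


# Twelve seconds of a synthetic single-axis signal at each sampling rate.
wrist_signal = [0.01 * (i % 20) for i in range(12 * 20)]  # 20 Hz monitor
hip_signal = [0.01 * (i % 40) for i in range(12 * 40)]    # 40 Hz monitor

wrist_windows = split_into_windows(wrist_signal, sample_rate_hz=20)
hip_windows = split_into_windows(hip_signal, sample_rate_hz=40)

print(len(wrist_windows), len(wrist_windows[0]))
print(len(hip_windows), len(hip_windows[0]))
print(window_features(wrist_windows[0]))
```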

iPAQ portable digital assistant and direct observation

Direct observation (DO) was conducted using an HP iPAQ personal digital assistant (PDA)

(HP Development Company, Palo Alto, CA) to obtain a criterion measure for total time spent in

SB and breaks in SB. During the study protocol, a trained observer used a PDA with BEST

software developed based on the Children’s Activity Rating Scale protocol (Puhl, Greaves et al.

1990). The observer used the codes T1-T14 to represent the 14 activities in the visit and recorded

the activities being performed continuously as they occurred throughout the visit. The codes T1-T3

represented the three sedentary activities (lying, reading, and computer use) in the visit, and these

were used to determine total time spent in SB and breaks in SB. Inter-rater reliability for DO was

above r=0.90 for this study.

Procedure

Upon arriving at the Human Energy Research Laboratory, details of the study were

discussed with each participant. Written informed consent was obtained, and a physical activity

readiness questionnaire was administered to ensure that the participant had no contraindications to

engaging in PA. After consenting, participant weight and height were measured (to the nearest 0.1

kg and 0.1 cm, respectively) according to standardized methods (Malina 1995). Body mass index

(BMI) was calculated by dividing body weight by the square of height (kg/m²). Participant age

was assessed by asking participants to state their age in years, and handedness (left or right) was

determined by asking participants which hand they prefer to use for the majority of everyday

activities.

After being fitted with the four accelerometers, participants performed 14 activities which

were meant to include many different types and intensities of activities that would likely be seen in

a free-living environment (shown in Table 5.1). Ambulatory activities (walking and jogging) are

common in accelerometer validation literature; we added the sedentary, exercise, and lifestyle

activities to determine the potential for the four accelerometers to measure SB accurately in a

setting where a variety of activities was being performed, as is normally seen in free-living

environments. Additionally, we added an activity where participants removed the accelerometers

so that the ANNs would be able to recognize non-wear, which is important to be able to detect in

free-living environments for compliance purposes and for differentiation of non-wear from SB.

Table 5.1. Activities performed during the simulated free-living protocol.

| Activity Category | Activity | Activity Intensity | Description of Activity* |
|---|---|---|---|
| Sedentary behaviors (SB) | Lying down (T1) | Sedentary | Lying on a mat on the floor |
| | Reading (T2) | Sedentary | Reading a magazine article while sitting at a table |
| | Computer (T3) | Sedentary | Sitting and playing a computer game that involves mouse clicking and typing |
| Standing (ST) | Standing (T4) | Light** | Standing still with arms at sides |
| Lifestyle (LI) | Laundry (T5) | Light | Folding towels and putting them in a laundry basket |
| | Sweeping (T6) | Light | Sweeping confetti into piles |
| Leisure walk (LW) | Walking slow (T7) | Light | Walking at a self-selected ‘slow’ pace in a hallway |
| Brisk walk (BW) | Walking fast (T8) | Moderate | Walking at a self-selected ‘brisk’ pace in a hallway |
| Jogging (JO) | Jogging (T9) | Vigorous | Jogging at a self-selected pace in a hallway |
| Cycling (CY) | Cycling (T10) | Moderate/Vigorous | Cycling on a cycle ergometer at a self-selected cadence of 50-100 rpm with 1 kg resistance |
| Stair use (SU) | Stair climbing and descending (T11) | Moderate/Vigorous | Walking up and down a flight of stairs at a self-selected pace |
| Exercise (EX) | Biceps curls (T12) | Light | Standing still while doing biceps curls with a 3-lb. weight in each hand |
| | Squats (T13) | Moderate | With feet shoulder-width apart, bending at the knees (to a 90° angle) while holding an unweighted broom behind the head |
| Non-wear (NW) | Non-wear of accelerometer (T14) | N/A | Not wearing the accelerometer |

* Activity order, intensity, and duration (3-10 minutes) were left up to participants.

** Standing has traditionally been considered SB; however, recent literature suggests that standing

should be considered light-intensity instead of SB due to the differential physiologic effects of

standing as compared to sitting/lying (Owen, Healy et al. 2010).

Participants completed the 14 activities (shown in Table 5.1) in a 90-minute, simulated

free-living protocol which took place within the Human Energy Research Laboratory and in a

building stairwell and hallway. The 14 activities were described to each participant prior to the

start of the protocol, and some of the less familiar activities (e.g., squats) were demonstrated to

ensure understanding. Participants completed each of the 14 activities for at least three minutes

and for no more than 10 minutes, but the order, intensity, and duration of the activities were left up

to each participant. Participants were also free to perform activities more than once if they so

chose. A research assistant directly observed and recorded each activity on a handheld PDA

computer while activities were being performed.

Additionally, the research assistant periodically updated participants on which activities

they still needed to complete. The non-wear activity was saved until the end of the protocol so that

participants would not spend a significant portion of the time trying to remove and reattach the

accelerometers. For this study, direct observation (DO) served as the criterion measure of total

time spent in SB and breaks in SB. Upon completion of the protocol, participants were given a

$35 Target® gift card.

Data reduction and modeling

Artificial neural networks

Artificial neural networks (ANNs) are nonlinear models which predict an outcome or

dependent variable y (e.g., energy expenditure or activity type) using a number of inputs x1…xk,

where k is the number of features used to predict y. A graphical depiction of the ANNs created in

the current study can be seen in Figure 5.1. The ANNs were developed in our previous work (Chapter 4)

to predict activity type; in the current study, these activity type predictions were used to estimate total time in SB

and breaks in SB. For activity type classification, the ANNs functioned similarly to a logistic

regression model. Setting the activity types as the nominal values a1…a14, the ANN model can be

seen in Equation 1.

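Equation 1: $\Pr(y = a_j) = C \exp\left( \sum_{h=1}^{H} w_{jh}\, U\left( \sum_{i=1}^{k} w_{hi}\, x_i \right) \right)$, for $j = 1, \ldots, 14$, where U is the activation function for the hidden layer (Figure 5.1)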

In Equation 1, Pr is probability, C is a constant chosen so that Pr(y=a1)+…+Pr(y=a14)=1,

w are the weights of the input features, and H is the number of hidden units. For each activity,

values closer to 1 represented a higher likelihood that the activity was being performed. The activity

with the value closest to 1 was chosen as the predicted output by the ANN. In accordance with

previous research, our models contained only one hidden layer (Preece, Goulermas et al. 2009;

Staudenmayer, Pober et al. 2009; Trost, Wong et al. 2012).
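
The classification step described above (one hidden layer, outputs normalized so the 14 class probabilities sum to 1, highest probability chosen) can be sketched as follows. This is a minimal illustration, not the fitted model: the logistic hidden activation, the absence of bias terms, the random weights, and the function names `ann_forward` and `predict_activity` are all assumptions made for the example.

```python
import math
import random


def ann_forward(x, w_hidden, w_output):
    """Return one probability per activity type for a single feature vector x."""
    # Hidden layer: weighted sum of the inputs passed through a logistic
    # activation (the activation choice is an assumption for this sketch).
    hidden = [
        1.0 / (1.0 + math.exp(-sum(w * xi for w, xi in zip(row, x))))
        for row in w_hidden
    ]
    # Output layer: one score per activity type.
    scores = [sum(w * h for w, h in zip(row, hidden)) for row in w_output]
    # Exponentiate and normalize so the probabilities sum to 1.
    peak = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_activity(x, w_hidden, w_output):
    """Return the 1-based code of the most probable activity (1 means T1)."""
    probs = ann_forward(x, w_hidden, w_output)
    return probs.index(max(probs)) + 1


# Dimensions taken from the text: 38 inputs, 15 hidden units, 14 activities.
N_FEATURES, N_HIDDEN, N_CLASSES = 38, 15, 14
rng = random.Random(0)  # hypothetical random weights, for illustration only
w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(N_FEATURES)] for _ in range(N_HIDDEN)]
w_output = [[rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)] for _ in range(N_CLASSES)]
x = [rng.uniform(-1.0, 1.0) for _ in range(N_FEATURES)]

probs = ann_forward(x, w_hidden, w_output)
predicted_code = predict_activity(x, w_hidden, w_output)
```

With trained weights in place of the random ones, `predicted_code` would be the activity label assigned to one five-second window.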

After classifying into specific activity types, the three sedentary activities (lying, computer

use, and reading) were collectively categorized as SB to allow for prediction of total time spent in

SB. Likewise, the 10 non-sedentary activities (standing, laundry, sweeping, walk slow and fast,

jogging, cycling, stair use, biceps curls, and squats) were collectively classified as non-SB in order

to predict breaks in SB. Non-wear was classified into its own category and later removed from the

dataset since there is no way to tell if a person is sedentary or non-sedentary if the accelerometer is

not being worn.

Figure 5.1. ANN for predicting activity type and sedentary behavior.

Figure 5.1 legend

* The number of input features was 38, as described in Table 5.2. Additionally, three hidden

units are shown in Figure 5.1 for simplicity, but 15 hidden units were used for construction of

the ANNs.

Accelerometer signal features (one of each per axis, three total of each per accelerometer)

1. Mean = mean

2. Var = variance

3. Cov = covariance

4. Min = minimum

5. Max = maximum

6. MeanOR = mean accelerometer orientation

7. VarOR = variance of accelerometer orientation

8. 10th %ile = 10th percentile

9. 25th %ile = 25th percentile

10. 50th %ile = 50th percentile

11. 75th %ile = 75th percentile

12. 90th %ile = 90th percentile

Participant characteristics features

13. Ht = participant height

14. Wt = participant weight

Non-feature abbreviations

T1-T3 are sedentary activities, and T4-T13 are non-sedentary activities.

S = summations of the input layer in the hidden units

U = activation function for the hidden layer

W1 = the weight vectors for each of the inputs

W2 = the weight vectors for each of the summations

The ANNs were created and tested using a leave-one-out cross-validation. In this

approach, data from all but one participant were used to estimate the weights for each input feature

for predicting activity type. Then, the ANN was tested on the data from the participant left out of

the training phase by supplying the input features and comparing the predicted activity type from

the ANNs to the recorded activity type from DO. The leave-one-out cross-validation is an iterative

approach and was repeated so that each participant’s data served as the testing data once, thereby

yielding one ANN for activity type for each participant in the study. The weights determined from

each iteration of the leave-one-out cross-validation were averaged to obtain a final ANN. This process

was conducted separately for each accelerometer, resulting in four distinct ANNs.
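
The leave-one-out scheme described above splits the data by participant, not by window. A minimal sketch of how such folds could be generated is shown below; the function name `leave_one_participant_out` and the participant IDs are hypothetical, and model fitting and weight averaging are left out.

```python
def leave_one_participant_out(participant_ids):
    """Yield (held_out_id, train_indices, test_indices), one fold per participant.

    participant_ids holds one entry per five-second window, naming the
    participant who produced that window.
    """
    for held_out in sorted(set(participant_ids)):
        train_idx = [i for i, p in enumerate(participant_ids) if p != held_out]
        test_idx = [i for i, p in enumerate(participant_ids) if p == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical example: six windows from three participants.
ids = ["P01", "P01", "P02", "P03", "P03", "P03"]
folds = list(leave_one_participant_out(ids))
```

Each fold trains on every window from all other participants and tests on all windows from the held-out participant, so no participant contributes to both sides of the same fold.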

The ANNs were developed with the intention to predict activity type, which could then be

used to estimate total time spent in SB as well as breaks in SB. In accordance with previous

research, we chose to use five-second windows for creation and testing of our ANNs (Preece,

Goulermas et al. 2009). Table 5.2 provides a list of the 38 features tested and used in the current

analyses. The 36 accelerometer features (12 features for each of the three axes) are time-domain

features that are simple to compute and have been used previously as inputs into machine learning

algorithms. Additionally, we included height and weight to account for different body sizes.

Table 5.2. Features used for EE and activity type prediction.

| Feature number | Feature used | Formula for calculating feature |
|---|---|---|
| 1-3* | Mean acceleration signal | mean(Ax) |
| 4-6* | Variance of acceleration signal | var(Ax) |
| 7-9* | Covariance of acceleration signal | ∑ ( ) ( ( ) )] |
| 10-12* | Minimum of acceleration signal | min(Ax) |
| 13-15* | Maximum of acceleration signal | max(Ax) |
| 16-18* | 10th percentile of acceleration signal | For every 100 accelerations, arrange in order from smallest to largest and pick the 10th value |
| 19-21* | 25th percentile of acceleration signal | For every 100 accelerations, arrange in order from smallest to largest and pick the 25th value |
| 22-24* | 50th percentile of acceleration signal | For every 100 accelerations, arrange in order from smallest to largest and pick the 50th value |
| 25-27* | 75th percentile of acceleration signal | For every 100 accelerations, arrange in order from smallest to largest and pick the 75th value |
| 28-30* | 90th percentile of acceleration signal | For every 100 accelerations, arrange in order from smallest to largest and pick the 90th value |
| N/A | Accelerometer orientation (needed for calculating features 31-36) | ( ) ( √( ) ) |
| 31-33* | Mean accelerometer orientation | mean of the accelerometer orientation |
| 34-36* | Variance of accelerometer orientation | variance of the accelerometer orientation |
| 37 | Participant height | N/A |
| 38 | Participant weight | N/A |

Ax is the acceleration in the direction of the x-axis.

*signifies that one feature is included for each of the three accelerometer axes. The formulas

shown are for the x-axis, but the formulas for the y-and z-axes are similar.
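
As an illustration of the simpler time-domain features in Table 5.2, the sketch below computes the mean, variance, minimum, maximum, and the five percentiles for one axis of one window. The covariance and orientation features are omitted here. The helper names are hypothetical, the use of the sample variance is an assumption, and the nearest-rank percentile rule is chosen to match the table's "pick the 10th value of every 100" description.

```python
import statistics


def nearest_rank_percentile(sorted_values, pct):
    """Nearest-rank percentile: with 100 sorted values, pct=10 picks the 10th."""
    n = len(sorted_values)
    rank = max(1, (pct * n + 99) // 100)  # integer ceiling of pct * n / 100
    return sorted_values[rank - 1]


def window_features(samples):
    """Compute per-axis features for the samples in one window."""
    ordered = sorted(samples)
    features = {
        "mean": statistics.fmean(samples),
        "var": statistics.variance(samples),  # sample variance (assumption)
        "min": ordered[0],
        "max": ordered[-1],
    }
    for pct in (10, 25, 50, 75, 90):
        features[f"p{pct}"] = nearest_rank_percentile(ordered, pct)
    return features


def split_into_windows(signal, sample_rate_hz, window_seconds=5):
    """Cut a one-axis signal into non-overlapping windows, dropping any remainder."""
    size = sample_rate_hz * window_seconds
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


# Hypothetical data: 100 samples, i.e. one five-second window at 20 Hz.
x_axis = [float(v) for v in range(1, 101)]
features = window_features(x_axis)
windows_40hz = split_into_windows(list(range(400)), sample_rate_hz=40)
```

At the sampling rates reported in the text, a five-second window holds 200 samples for the ActiGraph monitors (40 Hz) and 100 samples for the GENEA monitors (20 Hz).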

Assessing sedentary behavior using accelerometers

The ANNs were created in order to classify 10 different activity categories. In our initial

testing of the ANNs (Chapter 4), we found that they correctly classified 72.6%, 92.1%, 93.5%, and

92.7% of sedentary activities for the hip, thigh, left wrist, and right wrist accelerometer placements,

respectively. However, higher classification accuracy for sedentary activities does not necessarily

ensure better accuracy for predicting total time spent in SB or breaks in SB. Therefore, in the

current study, total time spent in SB for each participant was estimated using each accelerometer.

ANNs from each of the four accelerometers predicted the activities being performed throughout

the protocol, and all time spent lying, reading, or in computer use was summed to obtain a

prediction for total time spent in SB. Since each accelerometer predicted activity type separately

from the other accelerometers, we obtained four estimates of total time spent in SB for each

participant.
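
The summation described above amounts to counting the five-second windows predicted as T1-T3 and converting the count to minutes. A minimal sketch follows; the function name and the example label sequence are hypothetical.

```python
SEDENTARY_CODES = {"T1", "T2", "T3"}  # lying, reading, computer use
NON_WEAR_CODE = "T14"
WINDOW_SECONDS = 5


def total_sb_minutes(window_labels):
    """Sum the time predicted as sedentary, ignoring non-wear windows."""
    worn = [label for label in window_labels if label != NON_WEAR_CODE]
    n_sedentary = sum(1 for label in worn if label in SEDENTARY_CODES)
    return n_sedentary * WINDOW_SECONDS / 60


# Hypothetical predictions: 2 min lying, 1 min slow walking,
# 1 min computer use, 1 min non-wear.
labels = ["T1"] * 24 + ["T7"] * 12 + ["T3"] * 12 + ["T14"] * 12
sb_minutes = total_sb_minutes(labels)
```

Running the same function on the predictions from each of the four accelerometers gives the four per-participant estimates mentioned in the text.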

Similarly, breaks in SB were assessed for each participant and separately for each

accelerometer placement. A break in SB has been defined in previous research as when an interval

classified as SB is followed by an interval classified as a non-sedentary activity. We classified a

break in SB using three different lengths of time that a non-sedentary activity must occur to

constitute a break (we call this a break interval). First, since previous research using

accelerometers to measure SB uses 60-second break intervals, we defined a break in SB as

when 12 consecutive 5-second windows (12*5 = 60 seconds) of a sedentary activity were followed

by 12 consecutive windows of a non-sedentary activity. Second, a shorter break in SB (i.e., < 60

seconds) might be physiologically meaningful but may be missed if using a 60-second break

interval. Therefore, we also evaluated the accuracy of using the two shorter intervals for

measuring breaks in SB (30 seconds and 5 seconds). Using a 30-second break interval for

estimating breaks in SB, we defined a break as six consecutive windows of a sedentary activity

followed by six consecutive windows of a non-sedentary activity. Using a 5-second break interval

for estimating SB breaks, we defined a break as one window of sedentary activity followed by one

window of non-sedentary activity.
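
The three break definitions differ only in how many consecutive windows are required on each side of the transition (12, 6, or 1). The sketch below counts breaks from a sequence of per-window sedentary flags; it assumes that runs of "at least" the required length qualify, which the text does not state explicitly, and the function name is hypothetical.

```python
from itertools import groupby


def count_breaks(is_sedentary, min_windows):
    """Count sedentary-to-non-sedentary transitions where both runs last
    at least min_windows consecutive five-second windows."""
    # Run-length encode the sequence: [(flag, run_length), ...]
    runs = [(flag, len(list(group))) for flag, group in groupby(is_sedentary)]
    breaks = 0
    for (flag_a, len_a), (flag_b, len_b) in zip(runs, runs[1:]):
        if flag_a and not flag_b and len_a >= min_windows and len_b >= min_windows:
            breaks += 1
    return breaks


# Hypothetical sequence: one long bout, one medium bout, one single-window flicker.
sequence = (
    [True] * 12 + [False] * 12
    + [True] * 6 + [False] * 6
    + [True] * 1 + [False] * 1
)
breaks_60s = count_breaks(sequence, min_windows=12)
breaks_30s = count_breaks(sequence, min_windows=6)
breaks_5s = count_breaks(sequence, min_windows=1)
```

In this example the 5-second definition counts the single-window flicker as a break, which illustrates how brief misclassifications can inflate break counts under the shortest interval.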

Direct observation

Direct observation has been used successfully as a criterion measure of SB in previous

studies conducted in free-living settings (Lyden, Petruski et al. 2013) and served as our criterion

measure for the current study. Data on activities performed were recorded on a handheld PDA

using the BEST observation software. Using this software, activities performed during the visit

were coded as T1-T13, as shown in Table 5.1.

As the final activity in the visit, participants took off their accelerometers and set them on a

table, and then the next 3-10 minutes was recorded as non-wear (T14) while the accelerometers sat

on the table. Any activity coded as non-wear was not included when analyzing SB, since, by

definition, we cannot know whether participants are engaging in SB if the accelerometer is not being

worn. Exclusion of non-wear was necessary in order to determine the real-world suitability of the

ANNs for measurement of SB. Additionally, as participants transitioned from one activity to

another, we coded this time between activities in a special transition category (T15).

The recording of activities using DO took place continuously and in real time. Research

assistants were trained to record an activity change as closely as possible to the moment it

occurred. After collection, these DO data were synchronized with the accelerometer data so that

each five-second window of accelerometer data was matched to the actual activity performed

during that window. In most cases, only one activity occurred during a given five-second window.

However, when transitioning between activities, two activities could occur in the same window. If

this occurred, the window was automatically recoded as a transition. Additionally, we used the

transition category to define all time between activities, such as walking from one activity to

another or making an equipment adjustment between activities. Thus, transitions did not represent

a specific activity type but instead involved walking, standing, etc. that occurred at the end of one

activity and before the next started. We did not include transitions as a separate activity in the

ANN creation but instead removed them from the DO and accelerometer datasets prior to creation

of the ANNs and before prediction of total time spent in SB.
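
The window-matching rule described above (one label per five-second window, with any window spanning two activities recoded as a transition) can be sketched as follows. The segment format, whole-second timestamps, and function name are assumptions for illustration; the BEST software's actual export format is not described in the text.

```python
TRANSITION_CODE = "T15"


def label_windows(segments, window_seconds=5):
    """Assign one DO code to each window.

    segments is a sorted, contiguous list of (start_s, end_s, code) tuples
    with whole-second boundaries. A window overlapped by more than one
    segment is recoded as a transition.
    """
    total_seconds = segments[-1][1]
    labels = []
    for w_start in range(0, total_seconds - window_seconds + 1, window_seconds):
        w_end = w_start + window_seconds
        codes = {code for start, end, code in segments if start < w_end and end > w_start}
        labels.append(codes.pop() if len(codes) == 1 else TRANSITION_CODE)
    return labels


# Hypothetical observation: reading for 12 s, then jogging for 8 s.
segments = [(0, 12, "T2"), (12, 20, "T9")]
labels = label_windows(segments)
# Transitions are dropped before ANN creation and before summing SB time.
labels_without_transitions = [c for c in labels if c != TRANSITION_CODE]
```

The third window (10-15 s) contains both reading and jogging, so it becomes a transition and is excluded from the training and total-time datasets.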

However, breaks in SB only occurred during the times coded as transitions in the dataset

(e.g., transitioning from reading to jogging would represent a break in SB). Therefore, we added

the transition data back to the dataset after creation of the ANNs but before testing the ANNs for

their prediction of breaks in SB. For DO, any transition from a sedentary to a non-sedentary

activity was considered a break in SB, no matter how short the transition may have been.

Conversely, for the accelerometers, we predicted breaks in SB in three ways (using 5-, 30-, and 60-

second break intervals), as described in the previous section.

Statistical analyses

A criterion value of total time spent in SB was assessed for each participant using DO and

averaged for the entire sample. Similarly, estimates of SB from each of the four accelerometers

were calculated for each participant and averaged together for the entire sample. Differences

between criterion-measured and accelerometer-estimated total time spent in SB were evaluated

using repeated measures analysis of variance (RMANOVA). If significant differences were

revealed by the RMANOVA, post hoc dependent t-tests were conducted, with a least significant

difference (LSD) correction used to account for multiple comparisons. Additionally, root mean

square error (RMSE) values and their 95% confidence intervals (CIs) were calculated for predicted

vs. measured total time spent in SB for each of the four accelerometers. Significant differences for

RMSE among monitor locations were determined by non-overlap of a 95% CI with the mean from

another accelerometer location.
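
The distinction between the group-level comparison and RMSE matters for the results that follow: mean predicted time can match the criterion even when individual errors are large. A minimal sketch is shown below; the numbers are invented for illustration and are not study data, and the confidence interval calculation is omitted because the text does not specify how it was derived.

```python
import math


def rmse(predicted, measured):
    """Root mean square error between paired predictions and criterion values."""
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    squared_errors = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mean_difference(predicted, measured):
    """Average signed error (bias) across participants."""
    return sum(p - m for p, m in zip(predicted, measured)) / len(predicted)


# Invented example: minutes of SB for four participants.
measured = [20.0, 20.0, 20.0, 20.0]
predicted = [17.0, 23.0, 18.0, 22.0]
bias = mean_difference(predicted, measured)
error = rmse(predicted, measured)
```

Here the bias is zero because over- and underpredictions cancel, yet the RMSE is about 2.5 minutes, which is the kind of individual-level error the RMSE comparison is meant to expose.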

For breaks in SB, criterion-measured breaks were also obtained for each participant using

DO and averaged for the entire sample. Estimates of breaks in SB from each accelerometer were

obtained separately for five-, 30-, and 60-second break intervals for each participant and averaged

for the sample. Differences among DO, the four accelerometers, and the three windows were

evaluated with RMANOVA, and differences were evaluated using post hoc tests and an LSD

correction. Moreover, RMSE values and their 95% CIs were computed to compare predicted to

measured breaks in SB, with non-overlap of a 95% CI with the mean from another accelerometer

location or break interval indicative of statistically significant differences.

Power analysis

We desired 80% power to detect a difference of at least moderate effect size (ES=0.5)

among accelerometers and the criterion measure. Therefore, with the α level set at α = 0.05, we

needed 34 participants to be sufficiently powered to detect this difference. We chose to

oversample by 10 participants in order to have adequate sample size despite an expected loss of a

few participants due to the possibility of equipment malfunction, especially when using multiple

accelerometers and a handheld computer.
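
The text does not state which test or software produced the figure of 34 participants. As one plausible check, the sketch below assumes a two-sided paired (one-sample) t-test and uses the normal approximation with Guenther's small-sample correction; both the test choice and the function name are assumptions.

```python
import math
from statistics import NormalDist


def paired_sample_size(effect_size, alpha=0.05, power=0.80):
    """Approximate n for a two-sided paired t-test.

    Returns (n from the plain normal approximation,
             n after adding Guenther's correction of z_alpha^2 / 2).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n_normal = ((z_alpha + z_beta) / effect_size) ** 2
    n_corrected = n_normal + z_alpha ** 2 / 2
    return math.ceil(n_normal), math.ceil(n_corrected)


n_normal, n_corrected = paired_sample_size(effect_size=0.5)
```

Under these assumptions the corrected estimate comes to 34, in line with the sample size reported above, while the uncorrected normal approximation gives 32.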

RESULTS

Of the 44 participants who took part in the study, significant data loss occurred for the

thigh accelerometer in two participants, resulting in their exclusion from the data analysis.

Additionally, the portable metabolic analyzer (used to address a study aim not part of the current

manuscript) malfunctioned in three participants, resulting in premature termination of the

protocol and exclusion of their data from the analysis. Therefore, 39 participants with viable

data were included in the final data analysis. Sample demographics included in the analyses are

displayed in Table 5.3.

Table 5.3. Demographic characteristics of participants enrolled in study.

| | All (n=39) | Males (n=19) | Females (n=20) |
|---|---|---|---|
| Age (years) | 22.1 (4.3) | 23.7 (5.0) | 20.5 (2.7) |
| Weight (kg) | 72.4 (16.2) | 84.5 (13.1) | 60.8 (8.9) |
| Height (cm) | 171.4 (10.1) | 179.1 (7.7) | 164.1 (5.7) |
| BMI (kg/m²) | 24.4 (3.6) | 26.3 (3.4) | 22.5 (2.6) |

Data are displayed as mean (SD).

Predictions of total time spent in SB are shown in Figure 5.2. Overall, participants spent an

average of 20.7 minutes engaged in SB during the visit, according to DO. The hip

accelerometer tended to underpredict SB by 7.9% (19.1 minutes from the hip vs. 20.7 minutes

from DO), but this difference did not reach statistical significance. The estimates of total time

spent in SB predicted by the accelerometers placed on the thigh and both wrists were not

significantly different from that measured by DO. Additionally, we rearranged the data to

compare dominant and non-dominant wrist placements, but neither was significantly different

from DO-measured total time spent in SB.

Figure 5.2. Predictions of total time spent in SB compared to a criterion measure (DO).

AG Hip = Hip-mounted ActiGraph monitor, AG Thigh = Thigh-mounted ActiGraph monitor,

GE Left Wrist = GENEA monitor placed on the left wrist, GE Right Wrist = GENEA monitor

placed on the right wrist.

Although there were no significant differences among the four accelerometers compared

to DO for predicting total time spent in SB, there was considerable variation in RMSE (Table

5.4), ranging from 2.8 minutes with the accelerometer on the left wrist to 7.2 minutes with the

hip-mounted accelerometer. Each monitor placement site had significantly different RMSE

values than the other three, but it is notable that the two wrist accelerometer placements had

RMSE values that were 49-61% lower than the RMSE values achieved with the hip and thigh

accelerometer placements. The left wrist placement had significantly lower RMSE than the right

wrist placement; similarly, the non-dominant wrist placement had significantly lower RMSE

than the dominant wrist placement.

[Figure 5.2. Y-axis: Total Sedentary Behavior (min).]

Table 5.4. Root mean square error for prediction of total time spent in SB and breaks in SB.

| Accelerometer location | AG Hip | AG Thigh | GE Left Wrist | GE Right Wrist | Dominant Wrist | Non-dominant Wrist |
|---|---|---|---|---|---|---|
| RMSE for predicted total time spent in SB [Minutes (95% CI)] | 7.2 (6.9-7.4)* | 6.3 (5.0-6.5)* | 2.8 (2.7-2.9)* | 3.2 (2.8-3.5)* | 3.3 (2.9-3.5)^ | 2.7 (2.6-2.8) |
| RMSE for predicted 5-second breaks in SB [Breaks (95% CI)] | 31.5 (30.7-32.2)* | 21.0 (20.5-21.6)* | 32.4 (31.7-33.1)* | 29.5 (28.9-30.0)* | 30.7 (30.1-31.4) | 31.2 (30.6-31.9) |
| RMSE for predicted 30-second breaks in SB [Breaks (95% CI)] | 1.6 (1.5-1.6)* | 1.5 (1.4-1.5)* | 1.9 (1.8-1.9) | 1.9 (1.8-1.9) | 1.9 (1.8-2.0) | 1.9 (1.8-2.0) |
| RMSE for predicted 60-second breaks in SB [Breaks (95% CI)] | 2.0 (2.0-2.1) | 2.1 (2.0-2.1) | 2.1 (2.0-2.2)* | 2.2 (2.2-2.3)* | 2.2 (2.1-2.2) | 2.2 (2.1-2.2) |

* indicates significant difference from all other accelerometer placements.

^ indicates significant difference from non-dominant wrist placement.

AG Hip = Hip-mounted ActiGraph monitor, AG Thigh = Thigh-mounted ActiGraph monitor,

GE Left Wrist = GENEA monitor placed on the left wrist, GE Right Wrist = GENEA monitor

placed on the right wrist.

Predictions of breaks in SB compared to DO are shown in Figures 5.3-5.5. From these

figures, it is apparent that the choice of break interval for defining a break in SB altered the accuracy of

prediction of breaks in SB in this study. Choice of the five-second interval for defining a break in SB

(Figure 5.3) resulted in dramatic overestimations of breaks in SB for all four accelerometer

placements compared to DO. The thigh accelerometer placement performed best for predicting

breaks in SB for the five-second interval, but it still predicted over five times more breaks in SB

than were actually taken. On the other extreme, use of the 60-second interval for defining a

break in SB (Figure 5.5) resulted in underprediction of breaks by all four accelerometer

placements compared to DO, and none of the predictions were significantly different from each

other. The 30-second interval for defining a break in SB resulted in highest accuracy of

prediction of breaks in SB (Figure 5.4). The thigh and right wrist accelerometer placements

underestimated breaks slightly, by an average of 0.7-0.8 breaks per visit. Conversely, the hip

and left wrist accelerometer placements provided accurate predictions of breaks in SB with the

30-second interval. When analyzed by dominant and non-dominant wrists, the dominant wrist

placement underpredicted breaks, whereas the non-dominant wrist accurately predicted breaks in

SB.

Figure 5.3. Predictions of breaks in SB using a five-second interval.

* indicates significant difference from DO.

^ indicates significant difference from all other accelerometers.

Figure 5.4. Predictions of breaks in SB using a 30-second interval.

* indicates significant difference from DO.

[Figures 5.3 and 5.4. Y-axis: Number of Breaks in SB.]

Figure 5.5. Predictions of breaks in SB using a 60-second interval.

* indicates significant difference from DO.

Table 5.4 shows the RMSE values for predicted vs. measured breaks in SB, displayed

separately for the five-, 30-, and 60-second break intervals. For the five-second break interval,

the poor prediction accuracy for breaks in SB seen in Figure 5.3 was compounded by very high

RMSE values for all four accelerometer placements, ranging from an error of 21.0 breaks for the

thigh placement site to 32.4 breaks for the left wrist placement. The RMSE values for the 30-

and 60-second break intervals were considerably lower than for the five-second break interval.

For all four accelerometer placements, the 30-second break interval had significantly lower

RMSE than the 60-second interval, again indicating superior accuracy for the 30-second break

interval. When comparing among the four accelerometer placements, the hip and thigh

accelerometers had RMSE values 19-30% lower than both wrist placements for the 30-second break

interval and 4-9% lower than the wrist placements for the 60-second break interval.

[Figure 5.5. Y-axis: Number of Breaks in SB.]

When comparing the two wrist placements, prediction of total time spent in SB was not

significantly different between the two; however, RMSE for prediction of total time spent in SB

was about 12% lower for the left wrist placement than the right wrist placement (and about 18%

lower for the non-dominant wrist than the dominant wrist). For prediction of breaks in SB, both

dramatically overpredicted breaks using the five-second break interval and underpredicted breaks

using the 60-second break interval. Using the 30-second break interval, the RMSE values were

similar between monitors, but the left wrist prediction of breaks was not significantly different

from DO, whereas the right wrist underpredicted breaks compared to DO. Similarly, breaks

were underpredicted when data were analyzed for the dominant wrist, but the non-dominant

wrist placement resulted in accurate predictions of breaks with the 30-sec break interval.

DISCUSSION

The purpose of this investigation was to develop, validate, and compare the accuracy of

ANNs created to estimate total time spent in SB and breaks in SB from accelerometers located

on the hip, wrists, and thigh. Additionally, we compared the accuracy of accelerometers worn on the

left and right wrists for prediction of time spent in SB and breaks in SB. The ANNs were

developed in order to predict the type of activity being performed, and these were validated in

our previous work (Chapter 4). For prediction of total time spent in SB, we summed time

predicted as lying, reading, or computer use. Similarly, we defined a predicted break in SB as a

bout of time predicted as lying, reading, or computer use that was followed by a bout predicted as a

non-sedentary activity.

When examining total time spent in SB, predictions from all four accelerometers were

not significantly different from the criterion, although the hip trended toward underpredicting

time spent in SB. Additionally, the two wrist accelerometer placements had significantly lower

RMSE for predicting total time spent in SB compared to the hip and thigh placements, indicating

that the wrist placement sites had less individual error (and superior accuracy) when predicting

total time spent in SB. The hip accelerometer placement had the worst prediction of total time

spent in SB, with an RMSE value more than 100% greater than those seen with the wrist

placements and 14% higher than the RMSE from the thigh placement. That the hip

placement site performed worst of the four sites in terms of prediction error and tended to

underpredict total time spent in SB is not surprising given previous studies by Kozey-Keadle

et al. and Lyden et al. showing higher accuracy for measuring total time spent in SB

using a thigh accelerometer than a hip accelerometer (Kozey-Keadle, Libertine et al. 2011;

Lyden, Kozey Keadle et al. 2012). Additionally, Hart et al. used thigh- and hip-mounted

accelerometers in a free-living setting and found that the thigh placement had higher convergent

validity with other SB assessment measures than the hip placement (Hart, Ainsworth et al. 2011),

again supplying evidence that thigh-mounted accelerometers are preferable to hip accelerometers

for the measurement of total time spent in SB. Of note, the RMSE for the left wrist placement

was 12% lower than the RMSE for the right wrist placement, which increased to an 18%

difference when data were analyzed comparing the dominant and non-dominant wrists. These

findings indicate superior accuracy of the non-dominant wrist accelerometer for measurement of

total time spent in SB. Implications of this finding are discussed later in this section.
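
The distinction drawn above between group-level agreement with the criterion and individual-level error can be illustrated with a minimal Python sketch. The function names and all values below are hypothetical and are not study data; the sketch only shows how two placements can both have zero mean bias while differing greatly in RMSE.

```python
import math

def rmse(predicted, criterion):
    """Root mean square error across participants (individual-level error)."""
    if len(predicted) != len(criterion) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - c) ** 2 for p, c in zip(predicted, criterion)]
    return math.sqrt(sum(squared) / len(squared))

def mean_bias(predicted, criterion):
    """Mean signed error (group-level over- or underprediction)."""
    return sum(p - c for p, c in zip(predicted, criterion)) / len(predicted)

def percent_difference(value, reference):
    """Percent by which `value` exceeds `reference`."""
    return 100.0 * (value - reference) / reference

# Hypothetical minutes of SB for four participants (illustration only).
criterion = [30.0, 25.0, 40.0, 35.0]
wrist_pred = [31.0, 24.0, 41.0, 34.0]   # small individual errors
hip_pred = [34.0, 21.0, 44.0, 31.0]     # larger errors that cancel on average

wrist_rmse = rmse(wrist_pred, criterion)
hip_rmse = rmse(hip_pred, criterion)
print(mean_bias(wrist_pred, criterion), mean_bias(hip_pred, criterion))
print(wrist_rmse, hip_rmse, percent_difference(hip_rmse, wrist_rmse))
```

In this hypothetical example, both placements would appear unbiased at the group level, yet the second placement has a much larger RMSE, which is the pattern described for the hip relative to the wrists.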

To our knowledge, this is the first study that assessed the utility of wrist-mounted

accelerometers for measurement of SB. Initially, we were surprised by the superiority of the

wrist accelerometers to the thigh accelerometer for measurement of total time spent in SB given

the previous literature showing high accuracy of the thigh for measuring SB (Hart, Ainsworth et

al. 2011; Kozey-Keadle, Libertine et al. 2011). However, in our previous work (Chapter 4), the

left and right wrist accelerometer placements achieved activity type classification accuracies of

86.6% and 86.7%, respectively, which was slightly higher than the thigh placement accuracy

(84.0%) and much higher than that accuracy achieved with the hip (72.5%). Moreover, the left

and right wrist placements achieved prediction accuracies of 93.5% and 92.7%, respectively, for

prediction of sedentary activities, which was higher than the thigh (92.1%) and hip (72.5%).

Therefore, the highest overall prediction accuracies for activity type as well as the highest

recognition of sedentary activities support the idea that wrist-mounted accelerometers may also be best

for prediction of total time spent in SB.

In contrast, the wrist accelerometer placements did not outperform the hip and

thigh placements for estimating breaks in SB. All four accelerometer placements performed best

when using the 30-second break interval; using this break interval, the wrist placements had

RMSE values 19-30% higher than the hip or thigh placements for estimating breaks in SB,

indicating lower measurement accuracy from the wrist-mounted accelerometers. Additionally,

only the non-dominant wrist placement accurately estimated breaks in SB for the 30-second

break interval, with the dominant wrist underpredicting breaks by 17%. The thigh placement

also underpredicted breaks (by 15%) but had the smallest RMSE for prediction of breaks in SB.

Surprisingly, the hip placement performed the best of the four accelerometer placements,

accurately predicting breaks in SB while also yielding an RMSE only 5% higher than the thigh

and 19-23% lower than the wrist accelerometers. The high accuracy of the hip placement is

notable given mixed results reported by Lyden et al. concerning the utility of the hip for

measurement of SB. In one study, Lyden et al. found that a thigh accelerometer was able to

accurately classify breaks in SB, while a hip accelerometer overestimated breaks by 78-133%

depending on choice of cut-point used as the threshold for SB (Lyden, Kozey Keadle et al.

2012). In a more recent study, the authors found that when using ANNs, a hip-mounted

accelerometer was able to accurately measure total time spent in SB as well as breaks in SB

(Lyden, Keadle et al. 2013). The current study, in conjunction with Lyden’s work, provides

further evidence of the advantages of using machine learning for modeling accelerometer data

over using the traditional cut-point approach for measurement of SB using a hip-mounted

accelerometer.

Our finding that the wrist accelerometer placements were outperformed by the hip

placement for measurement of breaks in SB is surprising given that accuracy of activity type

classification, specifically for recognizing SB, is higher for the wrists than the hip (Chapter 4).

There are several possible reasons why higher classification accuracy did not translate to better

measurement of breaks in SB. First, the hip is relatively insensitive to limb movements, whereas

the thigh and wrists are not. Therefore, limb movements while sitting or lying (e.g., to drink

water, scratch an itch, or adjust equipment/clothing) may cause misclassification of one or

more five-second windows as non-sedentary activity. While an occasional misclassification

would have minimal effect on overall classification accuracy or total time spent in SB prediction,

these misclassifications would disrupt periods of SB and therefore lead to incorrect prediction of

a break in SB when one did not occur. It would seem that this type of misclassification would

increase the number of breaks detected and result in overprediction, and this was the case with

the 5-second break interval. However, given the relatively short periods of time some sedentary

activities were performed, it is possible that periodic misclassification due to sporadic limb

movement would keep the accelerometers from recognizing the activity as SB (especially with

longer break windows), thereby not recording the subsequent transition as a break from SB and

leading to underprediction of breaks in SB.
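
The effect of the break interval on break counts can be illustrated with a minimal Python sketch. It assumes five-second windows labeled as sedentary or non-sedentary and counts a break whenever a sedentary bout is followed by a non-sedentary run lasting at least the break interval; this rule, the function name, and the example data are assumptions made for illustration and may differ from the exact procedure used in this study.

```python
from itertools import groupby

WINDOW_S = 5  # assumed window length in seconds

def count_breaks(is_sedentary, break_interval_s, window_s=WINDOW_S):
    """Count SB bouts followed by a non-sedentary run >= break_interval_s."""
    min_windows = max(1, -(-break_interval_s // window_s))  # ceiling division
    # Collapse the window labels into (label, run_length) pairs.
    runs = [(label, sum(1 for _ in grp)) for label, grp in groupby(is_sedentary)]
    breaks = 0
    seen_sedentary = False
    for label, length in runs:
        if label:
            seen_sedentary = True
        elif seen_sedentary and length >= min_windows:
            breaks += 1
            seen_sedentary = False
    return breaks

# Hypothetical criterion: 2 minutes of sitting, then 1 minute of walking.
true_labels = [True] * 24 + [False] * 12
# Hypothetical prediction: one window misclassified during sitting.
pred_labels = list(true_labels)
pred_labels[10] = False

print(count_breaks(true_labels, 5))    # one true break
print(count_breaks(pred_labels, 5))    # the stray window adds a spurious break
print(count_breaks(pred_labels, 30))   # a longer interval ignores the stray window
```

Under these assumptions, a single misclassified window doubles the number of breaks counted with the 5-second interval but has no effect with the 30-second interval, consistent with the overprediction described above for the shortest interval.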

Additionally, for measurement of breaks in SB, the left wrist accelerometer placement

was able to predict breaks accurately with the 30-second break interval, while the dominant wrist

underpredicted breaks by about 17%. Given that 90% of our sample (all but four of the 39 participants)

was right-hand dominant, it is not surprising that upon analyzing the data comparing the

dominant and non-dominant wrists, the non-dominant wrist achieved better accuracy for

measurement of total time in SB and breaks in SB. It may be that the dominant wrist

accelerometer placement captured more irregular movement as participants performed the

various activities in the visit, leading to misclassification of breaks in SB. Given this possibility,

these results lend support to the convention in many large studies, such as NHANES, that wrist

accelerometers should be worn on the wrist of the non-dominant hand (Troiano and McClain

2012; Troiano, McClain et al. 2014).

It is important to reiterate that SB is a complex construct, and it is necessary to be able to

measure the individual components of SB (i.e., total time and breaks) in order to better

understand the influence of SB on health. Studies by Hamilton and colleagues provide

evidence that prolonged SB is worse than an equivalent amount of time spent in SB which is

frequently broken up by periods of non-sedentary activity (Bey and Hamilton 2003; Hamilton,

Hamilton et al. 2004). Additionally, Healy and colleagues have published several studies

showing inverse associations between breaks in SB and several health indices, independent of

total time spent in PA or SB (Healy, Dunstan et al. 2008; Healy, Matthews et al. 2011). Our

findings indicate that total time spent in SB may be best measured using wrist-mounted

accelerometers, while breaks in SB may be better measured by a hip-mounted accelerometer.

Therefore, as more research is conducted to better elucidate the health risks of total time spent in

SB and breaks in SB, choice of accelerometer placement should be determined by the exact

research question of interest.

Strengths and limitations

There were several limitations in this study. First, our sample consisted of mostly college

students with interest in health sciences and may not be reflective of the wider college age/young

adult population. Additionally, the amount of time spent sedentary as well as the number of

breaks in SB by participants during the study protocol is probably not reflective of an average

90-minute segment of the day, so without further research it is not guaranteed that the monitors

will perform with similar accuracy in a true free-living environment.

This study also had a few noteworthy strengths. To our knowledge, this was the first

study to assess the ability of wrist-mounted accelerometers for measurement of total time spent

in SB and breaks in SB, and our use of hip- and thigh-mounted accelerometers allowed for direct

comparison of accuracy of the wrist monitors to previously used methods of measuring SB.

Second, while a simulated free-living setting may not be totally reflective of a true free-living

environment, the simulated free-living setting allows for better generalizability of results than a

heavily controlled, laboratory-based protocol. By utilizing a simulated free-living setting, we

were able to allow some freedom in activity choice, intensity, and timing while still using high-

quality criterion measures and examining a wide range of different activities in a relatively

short period of time, thereby minimizing burden on participants and researchers.

Conclusions

Our results provide evidence that hip-, thigh-, and wrist-mounted accelerometers can

provide accurate estimates of total time spent in SB, although measurement at the individual

level may be most accurate using the wrist-mounted accelerometers. For measuring breaks in

SB, the 30-second break interval appeared most accurate for all four accelerometers. When

using the 30-second interval, the hip accelerometer performed best, although the left wrist

accelerometer was also able to accurately predict breaks in SB. Together these results indicate

that use of an accelerometer on the non-dominant wrist or the hip may be preferable for

measurement of SB in a free-living setting, although the thigh accelerometer should be evaluated

further due to its demonstrated utility for SB measurement in previous work. Additionally, when

combining these results with the results from Chapters 4 and 5 of this dissertation, it appears that

the wrist-mounted accelerometers (especially the non-dominant wrist accelerometer) perform

well for measurement of energy expenditure and best for classification of activity type and

measurement of SB. Therefore, these results suggest that the wrist may be an ideal measurement

site for measurement of many behavioral characteristics. With the previous and current use of

wrist-mounted accelerometers for sleep measurement, we plan to expand our ANNs to recognize

and classify sleep duration and quality in addition to the variables already assessed.

Additionally, use of wrist-mounted accelerometers may allow researchers to design pattern

recognition approaches to recognize eating behaviors; we plan to further explore this possibility

in future work.

CHAPTER 6

DISSERTATION SUMMARY AND RECOMMENDATIONS

Summary of results

High levels of physical activity (PA) and low levels of sedentary behavior (SB) are

known to be beneficial for improving physical and mental health and lowering the risk of many

chronic diseases (PAGAC 2008). Valid measurement tools are required to accurately assess the

relationship of PA and SB to health outcomes, monitor precise levels of PA or SB to identify

groups of people who are attaining insufficient PA and/or too much SB, and evaluate the

effectiveness of interventions aimed to increase PA and decrease SB. Accelerometers are

commonly used for prediction of energy expenditure, activity type (to determine PA participation),

and SB, but the models used to predict these outcomes vary considerably in their complexity and

accuracy. Therefore, the purposes of this dissertation were to 1) create predictive models from

accelerometer data with the intent to predict energy expenditure, activity type, and SB, 2) compare

the accuracy of models created from accelerometers worn on the right hip, right thigh, and both

wrists, and 3) develop and test the models created using simple input features and widely

available computational software.

Chapter 3: Estimation of energy expenditure

The first part of our investigation focused on the ability of the four accelerometer

placements to accurately estimate EE. We hypothesized that all four placements would achieve at

least moderately high accuracy for predicting EE, as indicated by correlations of r ≥ 0.60 (Safrit

and Wood 1995). The four placements achieved correlations of r = 0.82-0.89 with measured EE

from the Oxycon, supporting our hypothesis and indicating high accuracy for prediction of EE

from all four placements. Root mean square error (RMSE) was also calculated and ranged from

1.05 to 1.42 METs, which is in line with values seen in previous work. When comparing

placement sites, we hypothesized that the thigh location would show the highest EE prediction

accuracy. This hypothesis was supported, with the thigh accelerometer achieving higher

correlations and lower RMSE for predicting EE than the hip or wrist accelerometer placements.

Another important advantage of the thigh-mounted accelerometer over the other placements was

that the use of fewer input features in the EE prediction model (which reduces its complexity) did

not result in lower accuracy, whereas the predictive accuracy was lower when fewer features were

used with the hip and two wrist accelerometer models. These findings lend support to the use of

thigh-mounted accelerometers for achieving high predictive accuracy for measuring EE, even with

relatively simple prediction models. However, the superiority of the thigh accelerometer

placement should not overshadow the fact that both the hip and two wrist accelerometer

placements also achieved highly accurate predictions of EE.

One significant hurdle in assessing EE in free-living settings is choice of a criterion

measure. Doubly labeled water is often used as a criterion measure for total EE but cannot assess

minute-to-minute EE. As an alternate approach, Lyden et al. used direct observation as a criterion

measure of free-living EE by recording the activity performed and then looking up an EE value

from the Compendium of Physical Activities in order to predict EE for each activity (Ainsworth,

Haskell et al. 2011; Lyden, Keadle et al. 2013). A potential problem with this approach is that the

Compendium represents an average value of EE for activities and is not necessarily accurate for a

given individual, especially when the observer would have to record the activity and also estimate

the activity intensity. Additionally, this method does not allow for prediction of EE during

transitions between activities since a transition is not a defined activity type but instead is used to

classify times when a person moves from one activity to another. Given the limitations of these

methods, we chose to use indirect calorimetry via a portable metabolic analyzer as our criterion

measure, which measures oxygen consumption to derive estimates of EE. Use of this method

allowed us to record data during all activity times as well as during transitions.

Indirect calorimetry provides a valid measure of EE when a person performs steady-state

activities (Rosdahl, Gullstrand et al. 2010); however, when a person changes activities or moves to

a different intensity of activity, change in oxygen consumption lags behind, meaning that indirect

calorimetry may not capture the true energy requirement of a task unless the task is being

performed at steady state, which may take several minutes to achieve after an activity is started

(Kenney, Wilmore et al. 2012). In our study, participants performed 14 distinct activities but could

perform an activity more than once; the actual number performed ranged from 14-20 and averaged

about 16, with an average length of about five minutes per activity. Therefore, a significant portion

of time during the protocol was likely not spent in steady-state EE. Despite these shortcomings,

we deemed indirect calorimetry the best available criterion measure due to the limitations of

doubly labeled water and direct observation (discussed earlier).

The lack of steady-state EE seen in our study likely relates to true free-living situations, at

least for PA. In free-living settings, adults likely reach steady-state EE during SB since SB makes

up the majority of waking time and since most SB bouts are performed for a prolonged period of

time (i.e., > 10 minutes) (Matthews, Chen et al. 2008; Lyden, Kozey Keadle et al. 2012).

However, non-sedentary activities make up a much smaller portion of the day and are generally

performed in shorter bouts, especially with respect to higher-intensity activities (Troiano, Berrigan

et al. 2008); therefore, we expect that steady-state is rarely achieved during free-living PA.

Accordingly, we feel that more research and discussion are needed to develop ways of improving

the use of direct observation and/or indirect calorimetry for measurement of non-steady-state EE.

One potential idea would be to perform a similar protocol to ours but to add a second visit where

each participant can perform each activity at steady state while their EE is measured via indirect

calorimetry. Then, for the simulated free-living visit, direct observation could be used as the

criterion (similar to Lyden’s study), but the individual’s measured EE values from the steady-state visit

could be used to predict EE instead of using the Compendium for prediction of EE. This approach

would likely increase validity of direct observation but would also significantly increase participant

and researcher burden and cost of the study. However, we feel that our use of indirect calorimetry

represented an appropriate criterion measure to answer our research questions and are confident

that our results provide an accurate reflection of the true utility of the four accelerometer

placements we tested for prediction of EE.
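
The individualized criterion approach proposed above can be sketched in Python as follows. This is a hypothetical illustration only: the function name, the activity log, and all MET values are assumptions (the reference values are placeholders and are not taken from the Compendium), and the sketch simply substitutes a participant's own steady-state MET value for the reference value whenever one is available.

```python
def observed_mean_mets(observation_log, individual_mets, reference_mets):
    """Time-weighted mean METs from a direct-observation log.

    observation_log: list of (activity, minutes) pairs recorded by an observer.
    individual_mets: participant-specific steady-state METs, where measured.
    reference_mets: population-average METs used when no individual value exists.
    """
    total_min = 0.0
    total_met_min = 0.0
    for activity, minutes in observation_log:
        mets = individual_mets.get(activity, reference_mets[activity])
        total_min += minutes
        total_met_min += mets * minutes
    if total_min == 0:
        raise ValueError("observation log contains no observed time")
    return total_met_min / total_min

# Placeholder values for illustration (not Compendium values, not study data).
reference_mets = {"sitting": 1.3, "walking": 3.5}
individual_mets = {"walking": 4.0}  # hypothetical steady-state measurement
log = [("sitting", 10), ("walking", 5)]

print(observed_mean_mets(log, {}, reference_mets))               # reference values only
print(observed_mean_mets(log, individual_mets, reference_mets))  # individualized
```

The sketch does not address transitions between activities, which would still lack a defined value under this approach.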

In conclusion, the thigh accelerometer performed best of the placement sites for prediction

of EE, and the superiority of the thigh was more apparent with the simplest ANNs. However, the

wrist and hip placements achieved correlations within 10% and error within 25% of that achieved

by the thigh placement, indicating that high accuracy can also be achieved for measurement of EE

using accelerometers placed on the hip and wrists. Therefore, thigh-mounted accelerometers

should be used if EE measurement accuracy is of utmost importance, but the hip and wrists can be

used for accurate measurement as well, if these placement sites are more practical for the

population being tested or for the specific research question being addressed.

Chapter 4: Classification of activity type

The second major aim of this dissertation was evaluating the ability of accelerometers

located on the hip, thigh, and wrists to correctly predict the specific type of activity being

performed. Our first aim in this study was to create models to predict activity type using simple

input features and widely available, easy-to-use software packages. We were successful in

accomplishing this goal by using Microsoft Excel for data processing, cleaning, and reduction and

R for ANN creation. Our first hypothesis-driven aim was to compare overall classification

accuracies among the four accelerometers as well as compare accuracies for detecting specific

types of activities. From our results shown in Chapter 3 as well as in previous research by

members of our research group and others (Cleland, Kikhia et al. 2013; Dong, Montoye et al.

2013; Skotte, Korshoj et al. 2014), we hypothesized that the thigh accelerometer would achieve the

highest overall activity classification accuracy. However, our results did not support this

hypothesis. When comparing classification accuracies for identifying all 14 activities, the two

wrist accelerometers performed the best, with classification accuracies of 81.3-81.4%. They also

showed the highest sensitivity and specificity for activity classification accuracy, whereas the thigh

and hip accelerometers achieved accuracies of only 71.7% and 66.4%, respectively. When

grouping similar activities into categories, the accuracies of all four monitors improved; the wrist

accelerometers still had the highest classification accuracies at 86.6-86.7%, with the thigh being

much closer in accuracy (84.0%) than the hip (72.5%).

When looking at the classification accuracies of specific activity types, we hypothesized

correctly that the wrist accelerometers would have the highest classification accuracies for lifestyle

activities (laundry and sweeping) as well as other upper-body activities such as biceps curls. The

wrist accelerometers also achieved the highest accuracy for classifying sedentary activities, which

we hypothesized would be measured best with the thigh-mounted accelerometer given high

accuracy for SB measurement by thigh-mounted accelerometers seen in previous research (Kozey-

Keadle, Libertine et al. 2011; Lyden, Kozey Keadle et al. 2012). However, the seated activities in

this study (computer use and reading) involved arm movement, which we believed would be

detected better with the wrist accelerometers than the thigh accelerometer. Importantly, combining

the sedentary activities into one category resulted in the thigh achieving an overall classification

accuracy within 1.5% of that achieved by the wrists, providing evidence that even if the thigh

cannot accurately classify specific types of sedentary activity (e.g., lying vs. sitting), the thigh is

highly accurate for differentiating SB from non-sedentary activities. Our findings contrast

somewhat with previous work showing comparable or higher measurement accuracy of hip and thigh

accelerometers (Cleland, Kikhia et al. 2013; Dong, Montoye et al. 2013; Skotte, Korshoj et al.

2014). However, the current study tested a larger number of activities, and they occurred in a

simulated free-living setting, which can yield very different results compared to those found in a

laboratory-based setting (Gyllensten and Bonomi 2011; Lyden, Keadle et al. 2013; van Hees,

Golubic et al. 2013). In addition, the current study design is more generalizable to true free-living

settings than the findings from previous work.

We also compared classification accuracies of monitors worn on the left and right wrists.

Since about 10% of the sample was left-hand dominant, we also analyzed the data comparing the

dominant and non-dominant wrists. In both analyses, the classification accuracies of the monitors

on the wrists were within 1.0%, signifying similar accuracy of measurement regardless of the wrist

on which the accelerometer was worn. This finding suggests that the popular convention to wear

an accelerometer on the non-dominant wrist may be unnecessary for prediction of activity type,

especially if compliance will be improved by allowing wearers to choose the wrist on which to

wear the accelerometer (although our findings do not support that the wearer can switch between

wrists within a study).

Comparison of classification accuracies achieved among different studies is notoriously

difficult because classification accuracy is inversely related to activity number and similarity

among activities (all else held equal). Therefore, studies comparing the utility of different

accelerometer placement sites must directly compare each placement site. We chose to test the

hip, thigh, and wrists because they are the three most commonly used accelerometer placement

sites, but other sites, such as the ankle or lower back, may have advantages in certain situations and

should be considered for use in future studies.

Another difficulty of activity type classification studies is choice of activities to include in

the testing set. Predictive models can only predict activities that were used in the model creation;

for example, the models created in this dissertation can predict laundry but have no output variable

for gardening or dishwashing. When creating models to recognize specific activity types, there is

no way to include all activities that people may perform in their everyday lives. However, by

collapsing activities into categories comprising similar activities, it is possible to develop an idea of

how people spend their days and how active they are. The ANNs developed in this study showed

an ability to classify 10 categories of activities with sensitivities from 72.2-86.7%. Further

reduction to identification of activity intensities improved the sensitivity and AUC for the thigh

accelerometer placement and resulted in high classification accuracy by the thigh and wrist

placement sites and good classification accuracy by the hip placement site (Metz 1978).

In free-living settings, adults perform a wide variety of activities not included in this study;

therefore, we expect that the capability of our ANNs for prediction of specific types of activities

will be decreased in free-living settings. However, we demonstrated high predictive accuracy by

the thigh- and wrist-mounted accelerometer placements when collapsing our prediction into either

activity categories or activity intensities, and we feel that this approach is much more generalizable

because even activities not tested in this study can be grouped into an activity category or intensity

in order to measure activity levels in a free-living setting.
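
The effect of collapsing specific activities into categories can be illustrated with a minimal Python sketch. The mapping, the labels, and the predictions below are hypothetical and chosen only for illustration; they are not the category definitions or results of this dissertation.

```python
# Hypothetical mapping from specific activities to broader categories.
CATEGORY = {
    "lying": "sedentary",
    "reading": "sedentary",
    "computer use": "sedentary",
    "laundry": "lifestyle",
    "sweeping": "lifestyle",
    "biceps curls": "exercise",
}

def accuracy(predicted, actual):
    """Proportion of windows whose predicted label matches the actual label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

def collapse(labels, mapping=CATEGORY):
    """Replace each specific activity label with its category."""
    return [mapping[label] for label in labels]

# Hypothetical window-level labels (illustration only).
actual = ["lying", "reading", "computer use", "laundry", "sweeping", "biceps curls"]
predicted = ["reading", "reading", "computer use", "sweeping", "sweeping", "biceps curls"]

print(accuracy(predicted, actual))                      # specific activity types
print(accuracy(collapse(predicted), collapse(actual)))  # activity categories
```

In this example, confusions between similar activities (e.g., lying vs. reading) lower accuracy for specific activity types but disappear once both labels fall within the same category. An activity absent from the mapping would raise a KeyError here, mirroring the point that a model has no output for activities not included in its creation.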

Along with the discussion above, the importance of being able to classify specific activities

vs. activity categories (e.g., lifestyle, exercise) vs. activity intensities (e.g., sedentary, light)

will depend on the question of interest. For example, a physical therapist may be interested in

measuring specific types of exercise activities to gauge compliance in a rehabilitation program,

necessitating the differentiation of specific exercise activities. Alternatively, a mother might want

to be able to differentiate between her child’s reading and TV watching but might be happy with

any type of exercise or ambulatory activity. From a health behavior perspective, it is necessary to

recognize specific types of ambulatory activity to differentiate incidental activity from health-

enhancing activity. Specific exercise activities may not be as important to differentiate unless

dictated by a specific research question. Lifestyle activities and standing are likely most important

from an energy balance perspective or as breaks in SB. Lastly, from a pure health standpoint,

recognition of specific types of SB may not be as important; however, from an intervention

perspective, recognition of specific types of SB may be critical because getting someone to watch

less TV may require different techniques than getting someone to drive less or sit less at work.

In conclusion, our study indicates that a variety, but not all possible types,

of sedentary, ambulatory, lifestyle, and exercise activities were classified most accurately with

accelerometers placed on the left or right wrist, especially if classification of specific types of

activities is of importance. When activities were combined into similar categories, the thigh

accelerometer classification accuracy approached that achieved by the wrists, but the wrists

remained superior. Conversely, when classifying activities by activity intensities, the thigh

placement slightly outperformed the two wrist placements, although all three of these sites

achieved high overall intensity classification accuracy. These findings may vary depending on the

choice of activities included in a validation protocol; however, from these findings, it appears that

upper-body movements may be more unique to an activity than lower-body movements, allowing

for better recognition of activities when using simple input features for an ANN created from

wrist-accelerometer data.

Chapter 5: Estimation of sedentary behavior

The final objective of this dissertation was to assess the ability of hip, thigh, and wrist

accelerometers to accurately predict total time spent in SB as well as breaks in SB. Our first aim

of this study was to compare the accuracy of the hip, thigh, and wrist accelerometers for

measurement of total time spent in SB. We expected the thigh accelerometer to provide the most

accurate estimates, whereas we hypothesized that the hip would overestimate total SB due to

misclassification of standing as SB and that the wrist would underestimate total SB due to

misclassification of SB as a non-sedentary activity (due to aberrant wrist movement during SB).

Overall, all four accelerometer placements provided similar predictions of total time spent in SB,

but measurement error was considerably higher for the hip than the thigh, and the thigh had greater

error than the two wrist accelerometers. The finding that the hip had higher error than the thigh

supports previous work by Lyden, Kozey-Keadle, and Grant (Grant, Ryan et al. 2006; Kozey-

Keadle, Libertine et al. 2011; Lyden, Kozey Keadle et al. 2012). Our finding of superior accuracy

of the left and right wrist accelerometer placements was contrary to our initial hypothesis, but not

overly surprising, since the ANNs used to predict total SB were the same as the ones used to

classify activity type in Chapter 4, where the wrists outperformed the hip and thigh for overall

activity recognition as well as recognition of SB.

The second aim of the study was to compare the accuracy of the four accelerometers for

estimating breaks in SB. We used three different break intervals (5-, 30-, and 60-seconds) for

classifying a break in SB in order to determine an optimal break interval for measurement as it is

currently unknown what interval is best suited for recognizing breaks in SB. We found that the 5-

second interval was too short, with the misclassification of single windows of accelerometer data

resulting in dramatic overpredictions of breaks in SB by all four accelerometers. Conversely, the

60-second break interval appeared to be too long and resulted in underprediction of breaks in SB

by all four accelerometers. Using the 30-second interval, the hip and left wrist accelerometers

predicted breaks in SB accurately, while the thigh and right wrist accelerometers underpredicted

breaks. However, error in SB break prediction was lowest with the thigh and highest with the wrist

accelerometers. These findings were unexpected given the superior accuracy of the wrists for

predicting total time spent in SB; however, the findings point to the importance of measuring these

two constructs separately. Even though total time in SB and breaks in SB are related, accurate

measurement of one does not imply accurate prediction of the other. The hip accelerometer

placement’s ability to measure SB breaks accurately may be due to its insensitivity to limb

movement, whereas the wrists and thigh may have detected limb movement and potentially

misclassified SB as a non-sedentary activity, thereby misclassifying breaks in SB. Previous

research shows mixed findings of the accuracy of hip accelerometers for measurement of breaks in

SB, but a recent study by Lyden et al. highlights dramatically improved measurement of SB using

machine learning in comparison to the cut-point approach (Lyden, Keadle et al. 2013). Therefore,

our study provides further support that machine learning may allow for improved measurement of

SB using a hip accelerometer.
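The effect of the break interval described above can be illustrated with a minimal sketch. It assumes window-level predictions have already been reduced to sedentary/non-sedentary labels and counts a break whenever a sedentary run is followed by a non-sedentary run lasting at least the chosen interval. The function name, the 5-second window length, and the example label sequence are hypothetical and are not taken from the study's code.

```python
from itertools import groupby

def count_sb_breaks(is_sedentary, window_s, min_break_s):
    """Count breaks in SB: a sedentary run followed by a non-sedentary run
    that lasts at least min_break_s seconds (illustrative definition)."""
    # Collapse the label sequence into runs of (label, duration in seconds).
    runs = [(label, sum(1 for _ in group) * window_s)
            for label, group in groupby(is_sedentary)]
    breaks = 0
    for previous, current in zip(runs, runs[1:]):
        # Runs alternate, so a non-sedentary run after the first run
        # always follows a sedentary one.
        if previous[0] and not current[0] and current[1] >= min_break_s:
            breaks += 1
    return breaks

# Hypothetical 5-second windows: 60 s sitting, one misclassified window,
# 60 s sitting, then a genuine 45 s break.
labels = [True] * 12 + [False] + [True] * 12 + [False] * 9
for interval_s in (5, 30, 60):
    print(interval_s, count_sb_breaks(labels, window_s=5, min_break_s=interval_s))
```

In this toy sequence the 5-second interval counts the single misclassified window as a break (two breaks), the 30-second interval counts only the genuine break (one), and the 60-second interval misses it (zero), which mirrors the over- and underprediction pattern reported above.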

Interestingly, we found that the left and right wrist accelerometer placements performed

with similar accuracy when predicting total time spent in SB, but the left wrist showed higher

accuracy for prediction of SB breaks. Given that 90% of our sample was right-hand dominant, our

findings indicate potential superiority of the non-dominant wrist for measurement of SB, which is

in accordance with the convention for accelerometers to be placed on the non-dominant wrist.

We chose to predict time spent in SB and breaks in SB by first classifying into specific

types of activity. An alternate way to classify SB would be to use an EE of < 1.5 METs as SB.

Breaks would occur any time a predicted EE of < 1.5 METs was followed by an EE of ≥ 1.5

METs. This approach is how cut-point methods have been used, but the major drawback of this

method is that standing is an activity that typically elicits an EE of < 1.5 METs but is defined as a

non-sedentary activity by Owen et al. (Owen, Healy et al. 2010) due to evidence that standing may

not have the same health implications as activities such as sitting or lying (Katzmarzyk 2014).

Therefore, we determined that this method was inappropriate since it would likely misclassify

standing as SB.
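The difference between the two approaches can be sketched as follows. The activity labels, MET values, and function names are hypothetical and chosen only for illustration. The first function applies the 1.5-MET rule described above, and the second counts breaks from predicted activity types, with standing treated as non-sedentary.

```python
SB_MET_THRESHOLD = 1.5                   # MET cut-off described in the text
SEDENTARY_TYPES = {"lying", "sitting"}   # assumed labels; standing is excluded

def breaks_from_mets(mets):
    """Count transitions from < 1.5 METs to >= 1.5 METs."""
    return sum(1 for previous, current in zip(mets, mets[1:])
               if previous < SB_MET_THRESHOLD <= current)

def breaks_from_types(activity_types):
    """Count transitions from a sedentary type to a non-sedentary type."""
    return sum(1 for previous, current in zip(activity_types, activity_types[1:])
               if previous in SEDENTARY_TYPES and current not in SEDENTARY_TYPES)

# Hypothetical sequence: sit, sit, stand, stand, sit, walk.
activity_types = ["sitting", "sitting", "standing", "standing", "sitting", "walking"]
mets = [1.2, 1.2, 1.3, 1.3, 1.2, 3.0]   # illustrative values only

print(breaks_from_mets(mets))             # standing stays below 1.5 METs
print(breaks_from_types(activity_types))  # standing counts as a break
```

Under these assumed values the MET rule finds one break (sitting to walking), whereas the activity-type rule finds two because it also recognizes the sit-to-stand transition.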

In conclusion, the findings for prediction of total time spent in SB closely mirrored our

findings from Chapter 4 showing highest accuracy for the wrist accelerometers and lowest

accuracy for the hip accelerometer, although all four accelerometers provided similar estimations

of total time in SB. The results for prediction of breaks in SB were more mixed but indicated that the

hip was superior for measurement of breaks in SB. Further work is needed to confirm these

findings as there is limited and potentially conflicting research regarding the utility of different

accelerometer placement sites for measurement of SB.

Conclusions

This dissertation provides a comparison of the utility of accelerometers placed on the hip,

thigh, and wrists and machine learning models for measurements of three key behavioral

variables (energy expenditure, activity type recognition, and sedentary behavior) which are

important determinants for long-term health at an individual and population level. We sought to

determine if accelerometer placement affected measurement accuracy and if an optimal

placement existed for measurement of all three variables. Our study suggests that choice of

placement site affects measurement accuracy. Each outcome variable had a different optimal

placement site, with the thigh being best for energy expenditure, the wrists being best for activity

type classification, and the hip and left wrist being best for measurement of SB, although the

SB findings were somewhat mixed. Additionally, although one placement site was not best for

all measures, all placement sites allowed for high accuracy of measurement of energy

expenditure, the wrists and thigh achieved over 80% accuracy for activity type classification, and

all four monitors showed strengths and weaknesses for measurement of SB. Given these findings

along with those of previous work, it seems that choice of accelerometer placement should

depend on the specific research questions, the population being tested, the length of time monitors

are to be worn, and the complexity of the models desired.

In an effort to compare the accuracy of our ANNs for measurement of energy expenditure

vs. measurement of activity type, we have provided Table 6.1 below, which shows the

sensitivity, specificity, and AUC of the energy expenditure ANNs for their accuracy in

predicting activity intensity (similar to Table 4.9). With the activity type ANNs, AUC for

activity intensity was as low as 0.85 for the hip accelerometer placement (indicating good

accuracy) and as high as 0.94 for the thigh accelerometer placement (indicating high accuracy).

Conversely, the activity intensity AUC was much lower for the energy expenditure ANNs, with

AUC values of 0.75-0.76 for the hip and wrist placements and 0.79 for the thigh placement,

indicating only fair accuracy for the four placement sites. Therefore, it appears that in terms of

determining activity intensity, the activity type ANNs may be superior to the energy expenditure

ANNs for all accelerometer placement sites tested.
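For readers less familiar with these metrics, the sketch below shows one way sensitivity, specificity, and AUC can be computed for a single intensity category treated one-vs-rest. It assumes AUC is taken for a single operating point, in which case it equals the mean of sensitivity and specificity. The source does not state how AUC was computed, although the values in Table 6.1 are numerically consistent with this relationship (for example, the hip's sedentary sensitivity of 64.1% and specificity of 93.0% average to 0.79). The function name and example labels are hypothetical.

```python
def one_vs_rest_metrics(true_labels, predicted_labels, positive):
    """Sensitivity, specificity, and single-point AUC for one category.
    Assumes the positive class and at least one other class both occur."""
    tp = fn = fp = tn = 0
    for truth, prediction in zip(true_labels, predicted_labels):
        if truth == positive:
            if prediction == positive:
                tp += 1
            else:
                fn += 1
        elif prediction == positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Area under a ROC "curve" through (0,0), the operating point, and (1,1).
    auc = (sensitivity + specificity) / 2
    return sensitivity, specificity, auc

# Hypothetical window-level intensity labels.
true_labels = ["sed", "sed", "sed", "sed", "light", "light", "mod", "mod", "mod", "vig"]
predicted = ["sed", "sed", "sed", "light", "sed", "light", "mod", "mod", "light", "vig"]

sensitivity, specificity, auc = one_vs_rest_metrics(true_labels, predicted, "sed")
print(round(sensitivity, 2), round(specificity, 2), round(auc, 2))
```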

Table 6.1. Overall sensitivity, specificity, and AUC among the four accelerometer placement sites for classification of activity

intensity using the energy expenditure ANNs (developed in Chapter 3).

Sensitivity (% agreement)
             AG Hip         AG Thigh       GE Left Wrist   GE Right Wrist
Sedentary    64.1 (7.3)*    70.4 (6.9)*    58.7 (7.5)*     51.8 (7.6)*
Light        60.2 (6.3)*    65.2 (6.1)     66.2 (6.1)      66.8 (6.0)
Moderate     69.1 (6.7)     72.3 (6.5)     74.4 (6.3)      72.6 (6.5)
Vigorous     70.6 (9.0)     75.8 (8.4)     62.2 (9.6)*     65.6 (9.4)*
MVPA         83.8 (4.3)     88.1 (3.8)     86.0 (4.0)      85.2 (4.1)
Total        65.1 (3.6)     69.9 (3.4)*    66.0 (3.6)      64.4 (3.6)

Specificity (%)
             AG Hip         AG Thigh       GE Left Wrist   GE Right Wrist
Sedentary    93.0 (3.9)     90.7 (4.4)     93.1 (3.8)      93.6 (3.7)
Light        78.0 (5.3)     82.9 (4.8)*    77.5 (5.4)      74.6 (5.6)
Moderate     82.1 (5.6)     87.9 (4.7)*    83.1 (5.4)      83.4 (5.4)
Vigorous     97.5 (3.1)     96.5 (3.6)     98.0 (2.7)      97.8 (2.9)
MVPA         84.1 (4.3)*    90.2 (3.5)*    87.3 (3.9)      86.7 (4.0)
Total        85.6 (2.6)     88.1 (2.4)*    85.8 (2.6)      85.0 (2.7)

AUC
             AG Hip         AG Thigh       GE Left Wrist   GE Right Wrist
Sedentary    0.79 (0.01)*   0.81 (0.01)*   0.76 (0.01)*    0.73 (0.01)*
Light        0.69 (0.01)*   0.74 (0.01)*   0.72 (0.01)*    0.71 (0.01)*
Moderate     0.76 (0.01)*   0.80 (0.01)*   0.79 (0.01)*    0.78 (0.01)*
Vigorous     0.84 (0.02)*   0.86 (0.02)*   0.80 (0.02)*    0.82 (0.02)*
MVPA         0.84 (0.01)*   0.89 (0.01)*   0.87 (0.01)*    0.86 (0.01)*
Total        0.75 (0.01)    0.79 (0.01)*   0.76 (0.01)*    0.75 (0.01)

Values are shown as Mean (SD). The * indicates significant differences from all other accelerometer placement sites.

Current use of machine learning suffers from several pitfalls that this dissertation sought

to address. First, machine learning models are often built by engineers or computer scientists

who have an understanding of model building far beyond that of the average physical activity

researcher. The associated complexity of many machine learning models limits or prohibits their

use by physical activity researchers. The artificial neural networks created in this dissertation

were built with simple input variables that can easily be calculated in Microsoft Excel.

Additionally, we used pre-written R code for development and testing of our models, therefore

accomplishing our goal of making artificial neural network creation understandable and

accessible to non-experts.

Also, validation studies are often conducted in laboratories under strictly controlled

protocols that require activities to be performed at a constant intensity for a defined period of

time and in a specific order. These laboratory conditions are not similar to how people actually

act in a free-living environment, and previous research shows consistent drops in performance

when laboratory-validated techniques are applied to free-living situations (Gyllensten and

Bonomi 2011; Lyden, Keadle et al. 2013; van Hees, Golubic et al. 2013). Therefore, we allowed

participants considerable freedom in our protocol to make our setting as similar to a free-living

environment as possible while still keeping the visit short and having participants perform all 14

activities.

The dissertation results provide several important advances to the field of physical

activity and sedentary behavior measurement. First, we have improved measurement of energy

expenditure using a single accelerometer far beyond what has been achieved using count-based

regression and cut-points. With our energy expenditure models, it is possible to determine a

person’s daily kcal expenditure and therefore provide valuable information relevant to

interventions such as weight loss. In addition, total daily energy expenditure can be used as a

measure of total activity level in order to determine relationships with specific health outcomes.

Alternately, by using three METs as the threshold for MVPA, we can use the energy expenditure

models to determine daily MVPA levels, measure adherence to meeting the national physical

activity recommendations, and identify individuals or groups accumulating inadequate physical

activity or excessive sedentary behavior. Second, our activity type models are useful for

determining times of the day when participants are most/least active in addition to knowing how

much time they spend in certain behaviors. This information is important for individuals who

tailor specific intervention strategies to help people become more active and less sedentary.

Also, it may help in determining associations of specific behaviors (e.g., standing) with health

outcomes.
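As a rough illustration of how window-level MET predictions could be turned into the daily summaries mentioned above, the sketch below sums energy expenditure and MVPA time. It relies on the conventional approximation that 1 MET is about 1 kcal per kg of body mass per hour, which is an assumption introduced here rather than a conversion stated in this dissertation. The function name, window length, body mass, and MET values are likewise hypothetical.

```python
def summarize_day(mets_per_window, window_s, body_mass_kg, mvpa_threshold=3.0):
    """Return (kcal, MVPA minutes) from per-window MET predictions.
    Uses the approximation 1 MET ~= 1 kcal/kg/h (an assumption)."""
    window_h = window_s / 3600
    kcal = sum(met * body_mass_kg * window_h for met in mets_per_window)
    # Windows at or above three METs count toward MVPA.
    mvpa_min = sum(window_s for met in mets_per_window if met >= mvpa_threshold) / 60
    return kcal, mvpa_min

# Hypothetical 30-second windows: 60 min at 1.2 METs, then 30 min at 4.0 METs.
predicted_mets = [1.2] * 120 + [4.0] * 60
kcal, mvpa_min = summarize_day(predicted_mets, window_s=30, body_mass_kg=70)
print(round(kcal), mvpa_min)
```

Summing MVPA minutes across days would then allow comparison against physical activity recommendations, subject to the accuracy limits discussed above.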

Lastly, the emphasis on accurate sedentary behavior measurement in this dissertation was

warranted given the current lack of a measurement tool that is valid for assessment of sedentary

behavior as well as physical activity. A major purpose of this dissertation was to determine an

optimal method for measuring total time spent in sedentary behavior and breaks in sedentary

behavior since an accurate sedentary behavior measurement tool will facilitate further research

into the health risks of sedentary behavior and allow for evidence-based recommendations to be

developed regarding healthy levels of sedentary behavior. We believe that this dissertation

offers a fairly accurate measure of sedentary behavior, but there is room for improvement in this

measure.

It would be ideal if one accelerometer placement performed best for all variables of interest

because that would allow for recommendation of a single monitor placement, but this did not

occur. However, if we were to pick one accelerometer placement based on the results of this

dissertation, we would choose an accelerometer placed on the non-dominant wrist. This placement

showed the highest accuracy for activity type prediction and achieved high overall measurement

accuracy for energy expenditure and sedentary behavior. The dominant wrist also performed well

for activity type and energy expenditure prediction but with lower accuracy for prediction of

sedentary behavior. The thigh placement also performed well overall, but the wrist-mounted

accelerometers were more comfortable and convenient for wear and still yielded high measurement

accuracy. A good blend of practicality and accuracy is often desired for measurement tools used in

large epidemiologic, surveillance, or intervention studies. Additionally, it appears that more

accelerometer features may improve measurement accuracy, but simpler feature sets can still

provide high accuracy while simplifying the predictive models. From our results, we would

recommend the feature set consisting of the five accelerometer percentiles (10th, 25th, 50th, 75th, and

90th), which has been used in previous work and also showed high measurement accuracy in this

dissertation.
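A minimal sketch of the five-percentile feature set is given below. It assumes the features are computed from the acceleration samples of one axis within one window and uses linear interpolation between ranks. The interpolation method, function names, and sample values are assumptions for illustration and may differ from the calculations used in this dissertation.

```python
def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in [0, 100]) of a non-empty,
    pre-sorted list."""
    rank = (len(sorted_values) - 1) * p / 100
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

def percentile_features(window, percentiles=(10, 25, 50, 75, 90)):
    """Return the 10th, 25th, 50th, 75th, and 90th percentiles of a window."""
    ordered = sorted(window)
    return [percentile(ordered, p) for p in percentiles]

# Hypothetical single-axis samples for one window (arbitrary units).
window = list(range(20, -1, -1))
print(percentile_features(window))
```

Repeating this per axis and per window yields a small, fixed-length input vector, which is in keeping with the goal of simple features that can be calculated in a spreadsheet.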

Results of this dissertation encourage further exploration of accurate yet relatively simple

ways of using accelerometers to measure several important behavioral variables known to

influence health. Below, we have outlined future directions for exploration of sedentary behavior

measurement as well as other areas that build off the findings from this dissertation.

Recommendations for future research

From the findings of this dissertation, we have several recommendations for further

research. These recommendations are discussed below.

1. Further research should be conducted evaluating the accuracy of the hip, thigh, and wrist

accelerometer placement sites for measurement of sedentary behavior. Evidence is

emerging that sedentary behavior is an important health determinant, necessitating further

refinement of measurement tools which can accurately measure the various aspects of

sedentary behavior (total time, breaks, etc.) to better understand its health effects and

determine evidence-based recommendations for limiting sedentary behavior to improve

health. We feel that the models we developed for measurement of sedentary behavior

provided good accuracy but can be improved; some suggested areas for experimentation

include use of different input features and machine learning techniques that may be better

suited specifically for differentiating movement from non-movement.

2. The testing of the wrist, hip, and thigh placement sites should be expanded to a more

diverse population. Children and older adults have very different movement patterns and

physical activity levels (Bailey, Olson et al. 1995; Troiano, Berrigan et al. 2008), and

certain placements may have advantages in these different populations.

Additionally, overweight/obese or pregnant populations often feel uncomfortable while

wearing hip accelerometers (Feito, Bassett et al. 2011), so wrist and thigh placements

should be tested in these populations to determine if these are sufficient alternative sites

for accelerometers to be worn.

3. Accelerometer placement sites should be qualitatively and quantitatively evaluated for

wear preference, and compliance data for each site should be assessed. There is

preliminary information from NHANES that the wrist accelerometer placement has

slightly higher compliance than the hip accelerometer (Troiano and McClain 2012), but

these findings must be verified and expanded.

4. Machine learning algorithms other than artificial neural networks should be utilized in

model creation. Artificial neural networks are being studied more thoroughly, show high

measurement accuracy, and are easier to compute using R software than many other types

of algorithms, but they are also much less computationally efficient than other

algorithms. Additionally, other algorithms may be able to achieve higher measurement

accuracy than artificial neural networks (Preece, Goulermas et al. 2009). These

possibilities should be explored in future work.

5. The simulated free-living setting used in this study was a significant study strength, and

the results are likely more generalizable than those achieved in laboratory-based settings.

However, a simulated free-living setting does not provide a perfect representation of the

true free-living environment; thus, the artificial neural networks created in this study

should be evaluated in a true free-living setting.

6. Further work should be done to determine the optimal criterion measure for use in free-

living measurement of energy expenditure. We chose to use indirect calorimetry as the

criterion for this dissertation, even though the majority of time was likely spent not in

steady-state.

7. This dissertation provided preliminary validation of artificial neural networks developed

from accelerometer data to detect several behavioral variables, including energy

expenditure, recognition of common activities performed, and sedentary behavior.

Others have used accelerometers (usually placed on the wrist or hip) for measurement of

sleep quality and quantity (Jean-Louis, Kripke et al. 2001), both of which have known

associations with many health indices (Hoevenaar-Blom, Spijkerman et al. 2011).

Several proprietary activity monitors such as the Fitbit® or Fuelband® are designed to

monitor both activity and sleep, but these have questionable accuracy, and we are

unaware of a research-grade device that has been validated to accurately measure both

sleep and activity variables. We would like to expand the use of the machine learning

algorithms developed and validated in this dissertation to measure sleep quantity and

quality.

8. One important finding of this dissertation is that upper-body activities and specific

sedentary behaviors are detected well by wrist-mounted accelerometers. Diet is a

notoriously difficult variable to measure, and one reason for this difficulty is that diet is

most often subjectively recalled via diary, interview, or food frequency questionnaire

(Thompson and Subar 2013). Two objective methods exist to measure diet, direct

observation and blood-based biomarkers, but direct observation is likely to cause

reactivity and blood biomarkers are only useful for some nutrients and not overall diet

quality (Park, Vollset et al. 2013). An interesting potential application of machine

learning and pattern recognition would be to attempt to detect when someone is eating

using acceleration data from a wrist accelerometer. Eating is typically a seated, sedentary

behavior with predictable arm movement; these characteristics give us reason to believe

that eating could be recognized using a wrist-mounted accelerometer. This approach may

not be able to yield accurate estimates of diet quality or quantity of foods consumed, but

it could provide valuable information about eating behaviors such as frequency and

timing of meals. Also, there may be ways to use this information as feedback to the

wearers to improve subjective recall of eating behaviors and to combine physical activity

and eating behavior assessment to provide more accurate and focused health information.

APPENDICES

APPENDIX A

Consent form

APPENDIX B

Recruitment flyer

Figure B.1. Recruitment flyer

APPENDIX C

Recruitment email

APPENDIX D

Supplemental figures

Figure D.1. Equipment worn by participants during the 90-min protocol. Participant shown is

performing the lying activity (T1).

Figure D.2. Example of participant performing reading activity (T2).

Figure D.3. Example of participant performing computer use activity (T3).

Figure D.4. Example of participant performing standing activity (T4).

Figure D.5. Example of participant performing laundry activity (T5).

Figure D.6. Example of participant performing sweeping activity (T6).

Figure D.7. Example of participant performing walking slow and fast activities (T7 and T8).

Figure D.8. Example of participant performing jogging activity (T9).

Figure D.9. Example of participant performing cycling activity (T10).

Figure D.10. Example of participant performing stair use activity (T11).

Figure D.11. Example of participant performing biceps curls activity (T12).

Figure D.12. Example of participant performing squats activity (T13).

Figure D.13. Example of non-wear (T14).

REFERENCES

"R Core Development Team. R: A language and Environment for Statistical Computing. version

2.12.1."

(2008) "Physical Activity Guidelines Advisory Committee: 2008 Physical Activity Guidelines for

Americans."

(2008). "US Department of Health and Human Services. 2008 physical activity guidelines for

Americans." from http://www.health.gov/PAGuidelines/.

ACSM (2009). ACSM's Guidelines for Exercise Testing and Prescription, Lippincott Williams &

Wilkins.

ActiGraph. (2013). "Products: GT3X+ Monitor." from

http://www.actigraphcorp.com/products/gt3x-monitor/.

Ainsworth, B. E., W. L. Haskell, et al. (2011). "2011 Compendium of Physical Activities: a second

update of codes and MET values." Medicine and science in sports and exercise 43(8):

1575-1581.

Akkermans, M. A., M. J. Sillen, et al. (2012). "Validation of the oxycon mobile metabolic system

in healthy subjects." Journal of sports science & medicine 11(1): 182-183.

Albinali, F., S. Intille, et al. (2010). Using Wearable Activity Type Detection to Improve Physical

Activity Energy Expenditure Estimation. ACM Conference on Ubiquitous Computing.

Denmark: 311-320.

Aminian, S. and E. A. Hinckson (2012). "Examining the validity of the ActivPAL monitor in

measuring posture and ambulatory movement in children." The international journal of

behavioral nutrition and physical activity 9: 119.

Andre, D. and D. L. Wolf (2007). "Recent advances in free-living physical activity monitoring: a

review." Journal of diabetes science and technology 1(5): 760-767.

Arvidsson, D., F. Slinde, et al. (2007). "Energy cost of physical activities in children: validation of

SenseWear Armband." Medicine and science in sports and exercise 39(11): 2076-2084.

Atkin, A. J., T. Gorely, et al. (2012). "Methods of Measurement in epidemiology: sedentary

Behaviour." International journal of epidemiology 41(5): 1460-1471.

Ayabe, M., H. Kumahara, et al. (2013). "Epoch length and the physical activity bout analysis: an

accelerometry research issue." BMC research notes 6: 20.

Bailey, R. C., J. Olson, et al. (1995). "The level and tempo of children's physical activities: an

observational study." Medicine and science in sports and exercise 27(7): 1033-1041.

Bao, L. and S. S. Intille (2004). "Activity recognition from user-annotated acceleration data."

Proceedings of PERVASIVE 2004 LNCS 3001: 1-17.

Beaton, G. H., J. Milner, et al. (1979). "Sources of variance in 24-hour dietary recall data:

implications for nutrition study design and interpretation." The American journal of clinical nutrition 32(12): 2546-2559.

Bergouignan, A., F. Rudwill, et al. (2011). "Physical inactivity as the culprit of metabolic

inflexibility: evidence from bed-rest studies." Journal of applied physiology 111(4): 1201-

1210.

Berntsen, S., R. Hageberg, et al. (2010). "Validity of physical activity monitors in adults

participating in free-living activities." British journal of sports medicine 44(9): 657-664.

Berntsen, S., S. N. Stafne, et al. (2011). "Physical activity monitor for recording energy

expenditure in pregnancy." Acta obstetricia et gynecologica Scandinavica 90(8): 903-907.

Bey, L. and M. T. Hamilton (2003). "Suppression of skeletal muscle lipoprotein lipase activity

during physical inactivity: a molecular reason to maintain daily low-intensity activity." The

Journal of physiology 551(Pt 2): 673-682.

Bird, A. D. (1972). "The effect of surgery, injury, and prolonged bed rest on calf blood flow." The

Australian and New Zealand journal of surgery 41(4): 374-379.

Blair, S. N. (1993). "Evidence for success of exercise in weight loss and control." Annals of

internal medicine 119(7 Pt 2): 702-706.

Bonomi, A. G., G. Plasqui, et al. (2009). "Improving assessment of daily energy expenditure by

identifying types of physical activity with a single accelerometer." Journal of applied

physiology 107(3): 655-661.

Boone, J. E., P. Gordon-Larsen, et al. (2007). "Screen time and physical activity during

adolescence: longitudinal effects on obesity in young adulthood." The international journal

of behavioral nutrition and physical activity 4: 26.

Bouten, C. V., A. A. Sauren, et al. (1997). "Effects of placement and orientation of body-fixed

accelerometers on the assessment of energy expenditure during walking." Medical &

biological engineering & computing 35(1): 50-56.

Brage, S., N. Brage, et al. (2005). "Reliability and validity of the combined heart rate and

movement sensor Actiheart." European journal of clinical nutrition 59(4): 561-570.

Brage, S., N. Brage, et al. (2003). "Reliability and validity of the Computer Science and

Applications accelerometer in a mechanical setting." Measurement in Physical Education

and Exercise Science 7: 101-119.

Brage, S., N. Wedderkopp, et al. (2003). "Reexamination of validity and reliability of the CSA

monitor in walking and running." Medicine and science in sports and exercise 35(8): 1447-

1454.

Brownson, R. C., C. M. Hoehner, et al. (2009). "Measuring the built environment for physical

activity: state of the science." American journal of preventive medicine 36(4 Suppl): S99-

123 e112.

Carr, L. J. and M. T. Mahar (2012). "Accuracy of intensity and inclinometer output of three

activity monitors for identification of sedentary behavior and light-intensity activity."

Journal of obesity 2012: 460271.

Celis-Morales, C. A., F. Perez-Bravo, et al. (2012). "Objective vs. self-reported physical activity

and sedentary time: effects of measurement method on relationships with risk biomarkers."

PloS one 7(5): e36345.

Chobanian, A. V., G. L. Bakris, et al. (2003). "The Seventh Report of the Joint National

Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure:

the JNC 7 report." JAMA 289(19): 2560-2572.

Choi, L., Z. Liu, et al. (2011). "Validation of accelerometer wear and nonwear time classification

algorithm." Medicine and science in sports and exercise 43(2): 357-364.

Clark, B. K., A. A. Thorp, et al. (2011). "Validity of self-reported measures of workplace sitting

time and breaks in sitting time." Medicine and science in sports and exercise 43(10): 1907-

1912.

Cleland, I., B. Kikhia, et al. (2013). "Optimal placement of accelerometers for the detection of

everyday activities." Sensors 13(7): 9183-9200.

Colbert, L. H., C. E. Matthews, et al. (2011). "Comparative validity of physical activity measures

in older adults." Medicine and science in sports and exercise 43(5): 867-876.

Craft, L. L., T. W. Zderic, et al. (2012). "Evidence that women meeting physical activity guidelines

do not sit less: an observational inclinometry study." The international journal of behavioral

nutrition and physical activity 9: 122.

Crouter, S. E., C. Albright, et al. (2004). "Accuracy of polar S410 heart rate monitor to estimate

energy cost of exercise." Medicine and science in sports and exercise 36(8): 1433-1439.

Crouter, S. E. and D. R. Bassett, Jr. (2008). "A new 2-regression model for the Actical

accelerometer." British journal of sports medicine 42(3): 217-224.

Crouter, S. E., J. R. Churilla, et al. (2006). "Estimating energy expenditure using accelerometers."

European journal of applied physiology 98(6): 601-612.

Crouter, S. E., K. G. Clowers, et al. (2006). "A novel method for using accelerometer data to

predict energy expenditure." Journal of applied physiology 100(4): 1324-1331.

Crouter, S. E., E. Kuffel, et al. (2010). "Refined two-regression model for the ActiGraph

accelerometer." Medicine and science in sports and exercise 42(5): 1029-1037.

Dale, D., G. J. Welk, et al. (2002). Methods for Assessing Physical Activity and Challenges for

Research. Physical Activity Assessments for Health-Related Research. G. J. Welk.

Champaign, IL, Human Kinetics, Inc.: 19-36.

Dannecker, K. L., N. A. Sazonova, et al. (2013). "A comparison of energy expenditure estimation

of several physical activity monitors." Medicine and science in sports and exercise 45(11):

2105-2112.

De Vries, S. I., F. G. Garre, et al. (2011). "Evaluation of neural networks to identify types of

activity using accelerometers." Medicine and science in sports and exercise 43(1): 101-107.

DiNallo, J. M., D. S. Downs, et al. (2012). "Objectively assessing treadmill walking during the

second and third pregnancy trimesters." Journal of physical activity & health 9(1): 21-28.

Dingwell, J. B., J. P. Cusumano, et al. (2001). "Local dynamic stability versus kinematic variability

of continuous overground and treadmill walking." Journal of biomechanical engineering

123(1): 27-32.

Dong, B., S. Biswas, et al. (2013). "Comparing metabolic energy expenditure estimation using

wearable multi-sensor network and single accelerometer." Conference proceedings : ...

Annual International Conference of the IEEE Engineering in Medicine and Biology

Society. IEEE Engineering in Medicine and Biology Society. Conference 2013: 2866-

2869.

Dong, B., A. Montoye, et al. (2013). "Energy-aware activity classification using wearable sensor

networks." 87230Y-87230Y.

Dunstan, D. W., B. Howard, et al. (2012). "Too much sitting--a health hazard." Diabetes research

and clinical practice 97(3): 368-376.

Dwyer, T. J., J. A. Alison, et al. (2009). "Evaluation of the SenseWear activity monitor during

exercise in cystic fibrosis and in health." Respiratory medicine 103(10): 1511-1517.

Ekelund, U., S. Brage, et al. (2009). "Objectively measured moderate- and vigorous-intensity

physical activity but not sedentary time predicts insulin resistance in high-risk individuals."

Diabetes care 32(6): 1081-1086.

Erik Landhuis, C., R. Poulton, et al. (2008). "Programming obesity and poor fitness: the long-term

impact of childhood television." Obesity 16(6): 1457-1459.

Ermes, M., J. Parkka, et al. (2008). "Detection of daily activities and sports with wearable sensors

in controlled and uncontrolled conditions." IEEE transactions on information technology in

biomedicine : a publication of the IEEE Engineering in Medicine and Biology Society

12(1): 20-26.

Esliger, D. W., A. V. Rowlands, et al. (2011). "Validation of the GENEA Accelerometer."

Medicine and science in sports and exercise 43(6): 1085-1093.

Esliger, D. W. and M. S. Tremblay (2006). "Technical reliability assessment of three accelerometer

models in a mechanical setup." Medicine and science in sports and exercise 38(12): 2173-

2181.

Evenson, K. R. and J. W. Terry, Jr. (2009). "Assessment of differing definitions of accelerometer

nonwear time." Research quarterly for exercise and sport 80(2): 355-362.

Feito, Y., D. R. Bassett, et al. (2012). "Evaluation of activity monitors in controlled and free-living

environments." Medicine and science in sports and exercise 44(4): 733-741.

Feito, Y., D. R. Bassett, et al. (2011). "Effects of body mass index and tilt angle on output of two

wearable activity monitors." Med Sci Sports Exerc 43(5): 861-866.

Ferro-Luzzi, A. (1968). "[Inter- and intra-individual variability of the human energy expenditure in

the rest position]." Bollettino della Societa italiana di biologia sperimentale 44(7): 633-637.

Field, A. (2009). Discovering Statistics Using SPSS. London, SAGE Publications Ltd.

Ford, E. S., M. B. Schulze, et al. (2010). "Television watching and incident diabetes: Findings

from the European Prospective Investigation into Cancer and Nutrition-Potsdam Study."

Journal of diabetes 2(1): 23-27.

Fortune, E., V. Lugade, et al. (2014). "Validity of using tri-axial accelerometers to measure human

movement - Part II: Step counts at a wide range of gait velocities." Medical engineering &

physics.

Foster, R. C., L. M. Lanningham-Foster, et al. (2005). "Precision and accuracy of an ankle-worn

accelerometer-based pedometer in step counting and energy expenditure." Prev Med 41(3-

4): 778-783.

Freedson, P. S., K. Lyden, et al. (2011). "Evaluation of artificial neural network algorithms for

predicting METs and activity type from accelerometer data: validation on an independent

sample." Journal of applied physiology 111(6): 1804-1812.

Freedson, P. S., E. Melanson, et al. (1998). "Calibration of the Computer Science and Applications,

Inc. accelerometer." Medicine and science in sports and exercise 30(5): 777-781.

Frost, C. and I. R. White (2005). "The effect of measurement error in risk factors that change over

time in cohort studies: do simple methods overcorrect for 'regression dilution'?" Int J

Epidemiol 34(6): 1359-1368.

Gabriel, K. P., J. J. McClain, et al. (2010). "Issues in accelerometer methodology: the role of epoch

length on estimates of physical activity and relationships with health outcomes in

overweight, post-menopausal women." The international journal of behavioral nutrition

and physical activity 7: 53.

GENEActiv. (2013). "GENEAction: comprehensive data collection for every body." from

http://www.geneactive.co.uk/products/geneactiv-action.aspx.

Gierach, G. L., S. C. Chang, et al. (2009). "Physical activity, sedentary behavior, and endometrial

cancer risk in the NIH-AARP Diet and Health Study." International journal of cancer.

Journal international du cancer 124(9): 2139-2147.

Grant, P. M., C. G. Ryan, et al. (2006). "The validation of a novel activity monitor in the

measurement of posture and motion during everyday activities." British journal of sports

medicine 40(12): 992-997.

GRPS. (2012). "GRPS School Choice Expo." from http://www.grps.org/ourschools/high-schools.

Gyllensten, I. C. and A. G. Bonomi (2011). "Identifying types of physical activity with a single

accelerometer: evaluating laboratory-trained algorithms in daily life." IEEE transactions on

bio-medical engineering 58(9): 2656-2663.

Hagstromer, M., P. Oja, et al. (2006). "The International Physical Activity Questionnaire (IPAQ): a

study of concurrent and construct validity." Public Health Nutr 9(6): 755-762.

Ham, S. A., J. Kruger, et al. (2009). "Participation by US adults in sports, exercise, and recreational

physical activities." Journal of physical activity & health 6(1): 6-14.

Hamilton, M. T., D. G. Hamilton, et al. (2004). "Exercise physiology versus inactivity physiology:

an essential concept for understanding lipoprotein lipase regulation." Exercise and sport

sciences reviews 32(4): 161-166.

Hamilton, M. T., D. G. Hamilton, et al. (2007). "Role of low energy expenditure and sitting in

obesity, metabolic syndrome, type 2 diabetes, and cardiovascular disease." Diabetes

56(11): 2655-2667.

Hanggi, J. M., L. R. Phillips, et al. (2012). "Validation of the GT3X ActiGraph in children and

comparison with the GT1M ActiGraph." Journal of science and medicine in sport / Sports

Medicine Australia.

Hargens, A. R. and S. Richardson (2009). "Cardiovascular adaptations, fluid shifts, and

countermeasures related to space flight." Respiratory physiology & neurobiology 169

Suppl 1: S30-33.

Harrington, D. M., G. J. Welk, et al. (2011). "Validation of MET estimates and step measurement

using the ActivPAL physical activity logger." Journal of sports sciences 29(6): 627-633.

Harrison, C. L., R. G. Thompson, et al. (2011). "Measuring physical activity during pregnancy."

The international journal of behavioral nutrition and physical activity 8: 19.

Hart, T. L., B. E. Ainsworth, et al. (2011). "Objective and subjective measures of sedentary

behavior and physical activity." Medicine and science in sports and exercise 43(3): 449-

456.

Haskell, W. L., M. C. Yee, et al. (1993). "Simultaneous measurement of heart rate and body

motion to quantitate physical activity." Medicine and science in sports and exercise 25(1):

109-115.

Haymes, E. M. and W. C. Byrnes (1993). "Walking and running energy expenditure estimated by

Caltrac and indirect calorimetry." Med Sci Sports Exerc 25(12): 1365-1369.

Healy, G. N., B. K. Clark, et al. (2011). "Measurement of adults' sedentary time in population-

based studies." American journal of preventive medicine 41(2): 216-227.

Healy, G. N., D. W. Dunstan, et al. (2007). "Objectively measured light-intensity physical activity

is independently associated with 2-h plasma glucose." Diabetes care 30(6): 1384-1389.

Healy, G. N., D. W. Dunstan, et al. (2008). "Breaks in sedentary time: beneficial associations with

metabolic risk." Diabetes care 31(4): 661-666.

Healy, G. N., D. W. Dunstan, et al. (2008). "Television time and continuous metabolic risk in

physically active adults." Medicine and science in sports and exercise 40(4): 639-645.

Healy, G. N., C. E. Matthews, et al. (2011). "Sedentary time and cardio-metabolic biomarkers in

US adults: NHANES 2003-06." European heart journal 32(5): 590-597.

Healy, G. N., K. Wijndaele, et al. (2008). "Objectively measured sedentary time, physical activity,

and metabolic risk: the Australian Diabetes, Obesity and Lifestyle Study (AusDiab)."

Diabetes care 31(2): 369-371.

Heiermann, S., K. Khalaj Hedayati, et al. (2011). "Accuracy of a portable multisensor body

monitor for predicting resting energy expenditure in older people: a comparison with

indirect calorimetry." Gerontology 57(5): 473-479.

Heil, D. P. (2006). "Predicting activity energy expenditure using the Actical activity monitor."

Research quarterly for exercise and sport 77(1): 64-80.

Helmerhorst, H. J., K. Wijndaele, et al. (2009). "Objectively measured sedentary time may predict

insulin resistance independent of moderate- and vigorous-intensity physical activity."

Diabetes 58(8): 1776-1779.

Hendelman, D., K. Miller, et al. (2000). "Validity of accelerometry for the assessment of moderate

intensity physical activity in the field." Medicine and science in sports and exercise 32(9

Suppl): S442-449.

Herren, R., A. Sparti, et al. (1999). "The prediction of speed and incline in outdoor running in

humans using accelerometry." Medicine and science in sports and exercise 31(7): 1053-

1059.

Herrmann, S. D., T. V. Barreira, et al. (2012). "Impact of accelerometer wear time on physical

activity data: a NHANES semisimulation data approach." British journal of sports

medicine.

Hjorth, M. F., J. P. Chaput, et al. (2012). "Measure of sleep and physical activity by a single

accelerometer: Can a waist-worn Actigraph adequately measure sleep in children?" Sleep

and Biological Rhythms 10(4): 328-335.

Hoevenaar-Blom, M. P., A. M. Spijkerman, et al. (2011). "Sleep duration and sleep quality in

relation to 12-year cardiovascular disease incidence: the MORGEN study." Sleep 34(11):

1487-1492.

Howard, R. A., D. M. Freedman, et al. (2008). "Physical activity, sedentary behavior, and the risk

of colon and rectal cancer in the NIH-AARP Diet and Health Study." Cancer causes &

control : CCC 19(9): 939-953.

Hu, F. B., M. F. Leitzmann, et al. (2001). "Physical activity and television watching in relation to

risk for type 2 diabetes mellitus in men." Archives of internal medicine 161(12): 1542-

1548.

Hu, F. B., T. Y. Li, et al. (2003). "Television watching and other sedentary behaviors in relation to

risk of obesity and type 2 diabetes mellitus in women." JAMA : the journal of the

American Medical Association 289(14): 1785-1791.

Jakicic, J. M., M. Marcus, et al. (2004). "Evaluation of the SenseWear Pro Armband to assess

energy expenditure during exercise." Medicine and science in sports and exercise 36(5):

897-904.

Janz, K. F. (2002). Use of Heart Rate Monitors to Assess Physical Activity. Physical Activity

Assessments for Health-Related Research. G. J. Welk. Champaign, IL, Human Kinetics,

Inc.: 143-162.

Janz, K. F., J. Witt, et al. (1995). "The stability of children's physical activity as measured by

accelerometry and self-report." Medicine and science in sports and exercise 27(9): 1326-

1332.

Jean-Louis, G., D. F. Kripke, et al. (2001). "Sleep detection with an accelerometer actigraph:

comparisons with polysomnography." Physiol Behav 72(1-2): 21-28.

Johnstone, A. M., S. D. Murison, et al. (2005). "Factors influencing variation in basal metabolic

rate include fat-free mass, fat mass, age, and circulating thyroxine but not sex, circulating

leptin, or triiodothyronine." The American journal of clinical nutrition 82(5): 941-948.

Kampert, J. B., S. N. Blair, et al. (1996). "Physical activity, physical fitness, and all-cause and

cancer mortality: a prospective study of men and women." Annals of epidemiology 6(5):

452-457.

Katzmarzyk, P. T. (2014). "Standing and mortality in a prospective cohort of Canadian adults."

Medicine and science in sports and exercise 46(5): 940-946.

Katzmarzyk, P. T., T. S. Church, et al. (2009). "Sitting time and mortality from all causes,

cardiovascular disease, and cancer." Medicine and science in sports and exercise 41(5):

998-1005.

Kenney, W., J. Wilmore, et al. (2012). Physiology of sport and exercise. Champaign, IL, Human

Kinetics.

Khan, A. M., Y. K. Lee, et al. (2008). "Accelerometer signal-based human activity recognition

using augmented autoregressive model coefficients and artificial neural nets." Conference

proceedings : ... Annual International Conference of the IEEE Engineering in Medicine and

Biology Society. IEEE Engineering in Medicine and Biology Society. Conference 2008:

5172-5175.

Khan, A. M., Y. K. Lee, et al. (2010). "A triaxial accelerometer-based physical-activity recognition

via augmented-signal features and a hierarchical recognizer." IEEE transactions on

information technology in biomedicine : a publication of the IEEE Engineering in

Medicine and Biology Society 14(5): 1166-1172.

Kierkegaard, A., L. Norgren, et al. (1987). "Incidence of deep vein thrombosis in bedridden non-

surgical patients." Acta medica Scandinavica 222(5): 409-414.

Kinder, J. R., K. A. Lee, et al. (2012). "Validation of a hip-worn accelerometer in measuring sleep

time in children." J Pediatr Nurs 27(2): 127-133.

King, A. C. and D. L. Tribble (1991). "The role of exercise in weight regulation in nonathletes."

Sports medicine 11(5): 331-349.

Kozey-Keadle, S., A. Libertine, et al. (2011). "Validation of wearable monitors for assessing

sedentary behavior." Medicine and science in sports and exercise 43(8): 1561-1567.

Krahenbuhl, G. S. and T. J. Williams (1992). "Running economy: changes with age during

childhood and adolescence." Medicine and science in sports and exercise 24(4): 462-466.

Kripke, D. F., D. J. Mullaney, et al. (1978). "Wrist actigraphic measures of sleep and rhythms."

Electroencephalogr Clin Neurophysiol 44(5): 674-676.

Lagerros, Y. T. and P. Lagiou (2007). "Assessment of physical activity and energy expenditure in

epidemiological research of chronic diseases." Eur J Epidemiol 22(6): 353-362.

LaPorte, R. E., L. H. Kuller, et al. (1979). "An objective measure of physical activity for

epidemiologic research." American journal of epidemiology 109(2): 158-168.

LaPorte, R. E., H. J. Montoye, et al. (1985). "Assessment of physical activity in epidemiologic

research: problems and prospects." Public health reports 100(2): 131-146.

Le Masurier, G. C., C. L. Sidman, et al. (2003). "Accumulating 10,000 steps: does this meet

current physical activity guidelines?" Research quarterly for exercise and sport 74(4): 389-

394.

Lee, I. M. and P. J. Skerrett (2001). "Physical activity and all-cause mortality: what is the dose-

response relation?" Medicine and science in sports and exercise 33(6 Suppl): S459-471;

discussion S493-494.

Lee, J. M., Y. Kim, et al. (2014). "Validity of Consumer-Based Physical Activity Monitors."

Medicine and science in sports and exercise.

Levine, J. A., N. L. Eberhardt, et al. (1999). "Role of nonexercise activity thermogenesis in

resistance to fat gain in humans." Science 283(5399): 212-214.

Levine, J. A., L. M. Lanningham-Foster, et al. (2005). "Interindividual variation in posture

allocation: possible role in human obesity." Science 307(5709): 584-586.

Lord, S., S. F. Chastin, et al. (2011). "Exploring patterns of daily physical and sedentary behaviour

in community-dwelling older adults." Age Ageing 40(2): 205-210.

Lyden, K. (2012). Refinement, validation and application of a machine learning method for

estimating physical activity and sedentary behavior in free-living people. Dissertation.

Amherst, MA.

Lyden, K., S. K. Keadle, et al. (2013). "A Method to Estimate Free-Living Active and Sedentary

Behavior from an Accelerometer." Medicine and science in sports and exercise.

Lyden, K., S. L. Kozey Keadle, et al. (2012). "Validity of two wearable monitors to estimate

breaks from sedentary time." Medicine and science in sports and exercise 44(11): 2243-

2252.

Lyden, K., S. L. Kozey, et al. (2011). "A comprehensive evaluation of commonly used

accelerometer energy expenditure and MET prediction equations." European journal of

applied physiology 111(2): 187-201.

Lyden, K., N. Petruski, et al. (2013). "Direct Observation is a Valid Criterion for Estimating

Physical Activity and Sedentary Behavior." Journal of physical activity & health.

MacMahon, S., R. Peto, et al. (1990). "Blood pressure, stroke, and coronary heart disease. Part 1,

Prolonged differences in blood pressure: prospective observational studies corrected for the

regression dilution bias." Lancet 335(8692): 765-774.

Maddocks, M., A. Petrou, et al. (2010). "Validity of three accelerometers during treadmill walking

and motor vehicle travel." British journal of sports medicine 44(8): 606-608.

Malina, R. (1995). "Anthropometry." Physiological assessment of human fitness: 205-219.

Mannini, A., S. S. Intille, et al. (2013). "Activity recognition using a single accelerometer placed at

the wrist or ankle." Med Sci Sports Exerc.

Mannini, A. and A. M. Sabatini (2010). "Machine learning methods for classifying human physical

activity from on-body accelerometers." Sensors (Basel) 10(2): 1154-1175.

Manson, J. E., D. M. Nathan, et al. (1992). "A prospective study of exercise and incidence of

diabetes among US male physicians." JAMA : the journal of the American Medical

Association 268(1): 63-67.

Martin, A., M. McNeill, et al. (2011). "Objective measurement of habitual sedentary behavior in

pre-school children: comparison of activPAL with Actigraph monitors." Pediatric exercise

science 23(4): 468-476.

Martinsen, E. W., A. Hoffart, et al. (1989). "Comparing aerobic with nonaerobic forms of exercise

in the treatment of clinical depression: a randomized trial." Comprehensive psychiatry

30(4): 324-331.

Masse, L. C., B. F. Fuemmeler, et al. (2005). "Accelerometer data reduction: a comparison of four

reduction algorithms on select outcome variables." Medicine and science in sports and

exercise 37(11 Suppl): S544-554.

Matthews, C. E. (2005). "Calibration of accelerometer output for adults." Medicine and science in

sports and exercise 37(11 Suppl): S512-522.

Matthews, C. E., B. E. Ainsworth, et al. (2002). "Sources of variance in daily physical activity

levels as measured by an accelerometer." Medicine and science in sports and exercise

34(8): 1376-1381.

Matthews, C. E., K. Y. Chen, et al. (2008). "Amount of time spent in sedentary behaviors in the

United States, 2003-2004." American journal of epidemiology 167(7): 875-881.

Matthews, C. E., S. C. Moore, et al. (2012). "Improving self-reports of active and sedentary

behaviors in large epidemiologic studies." Exercise and sport sciences reviews 40(3): 118-

126.

McClain, J. J., S. B. Sisson, et al. (2007). "Actigraph accelerometer interinstrument reliability

during free-living in adults." Medicine and science in sports and exercise 39(9): 1509-1514.

McKenzie, T. (2002). Use of direct observation to assess physical activity. Physical Activity

Assessments for Health-Related Research. G. Welk. Champaign, IL, Human Kinetics, Inc.:

179-195.

Melanson, E. L., Jr. and P. S. Freedson (1995). "Validity of the Computer Science and

Applications, Inc. (CSA) activity monitor." Medicine and science in sports and exercise

27(6): 934-940.

Metcalf, B. S., J. S. Curnow, et al. (2002). "Technical reliability of the CSA activity monitor: The

EarlyBird Study." Medicine and science in sports and exercise 34(9): 1533-1537.

Metz, C. E. (1978). "Basic principles of ROC analysis." Seminars in nuclear medicine 8(4): 283-

298.

Mignault, D., M. St-Onge, et al. (2005). "Evaluation of the Portable HealthWear Armband: a

device to measure total daily energy expenditure in free-living type 2 diabetic individuals."

Diabetes care 28(1): 225-227.

Mikines, K. J., E. A. Richter, et al. (1991). "Seven days of bed rest decrease insulin action on

glucose uptake in leg and whole body." Journal of applied physiology 70(3): 1245-1254.

Montgomery-Downs, H. E., S. P. Insana, et al. (2012). "Movement toward a novel activity

monitoring device." Sleep & breathing = Schlaf & Atmung 16(3): 913-917.

Montoye, A., B. Dong, et al. (2013). Assessing the effect of accelerometer placement and

modeling method on energy expenditure measurement. East Lansing, MI.

Montoye, A., B. Dong, et al. (2014). "Use of a wireless network of accelerometers for improved

measurement of human energy expenditure." Electronics 3(2): 205-220.

Montoye, H., H. Kemper, et al. (1996). Measuring physical activity and energy expenditure.

Champaign, IL, Human Kinetics.

Montoye, H. J., R. Washburn, et al. (1983). "Estimation of energy expenditure by a portable

accelerometer." Medicine and science in sports and exercise 15(5): 403-407.

Moon, J. K. and N. F. Butte (1996). "Combined heart rate and activity improve estimates of

oxygen consumption and carbon dioxide production rates." Journal of applied physiology

81(4): 1754-1761.

Morris, J. N., D. G. Clayton, et al. (1990). "Exercise in leisure time: coronary attack and death

rates." British heart journal 63(6): 325-334.

Morris, J. N., J. A. Heady, et al. (1953). "Coronary heart-disease and physical activity of work."

Lancet 265(6796): 1111-1120; concl.

Moy, K. L., J. F. Sallis, et al. (2010). "Culturally-specific physical activity measures for Native

Hawaiian and Pacific Islanders." Hawaii medical journal 69(5 Suppl 2): 21-24.

Mullaney, D. J., D. F. Kripke, et al. (1980). "Wrist-actigraphic estimation of sleep time." Sleep

3(1): 83-92.

Nichols, J. F., C. G. Morgan, et al. (1999). "Validity, reliability, and calibration of the Tritrac

accelerometer as a measure of physical activity." Med Sci Sports Exerc 31(6): 908-912.

Oliver, M., H. M. Badland, et al. (2011). "Identification of accelerometer nonwear time and

sedentary behavior." Research quarterly for exercise and sport 82(4): 779-783.

Orendurff, M. S., J. A. Schoen, et al. (2008). "How humans walk: bout duration, steps per bout,

and rest duration." Journal of rehabilitation research and development 45(7): 1077-1089.

Orme, M., K. Wijndaele, et al. (2014). "Combined influence of epoch length, cut-point and bout

duration on accelerometry-derived physical activity." The international journal of

behavioral nutrition and physical activity 11(1): 34.

Owen, N., G. N. Healy, et al. (2010). "Too much sitting: the population health science of sedentary

behavior." Exercise and sport sciences reviews 38(3): 105-113.

Paffenbarger, R. S., Jr., R. T. Hyde, et al. (1986). "Physical activity, all-cause mortality, and

longevity of college alumni." N Engl J Med 314(10): 605-613.

Paffenbarger, R. S., Jr., A. L. Wing, et al. (1983). "Physical activity and incidence of hypertension

in college alumni." Am J Epidemiol 117(3): 245-257.

PAGAC (2008). Physical Activity Guidelines Advisory Committee Report, 2008. Washington, DC,

US Department of Health and Human Services.

Papazoglou, D., G. Augello, et al. (2006). "Evaluation of a multisensor armband in estimating

energy expenditure in obese individuals." Obesity 14(12): 2217-2223.

Parikh, R., A. Mathai, et al. (2008). "Understanding and using sensitivity, specificity and predictive

values." Indian journal of ophthalmology 56(1): 45-50.

Park, J. Y., S. E. Vollset, et al. (2013). "Dietary intake and biological measurement of folate: a

qualitative review of validation studies." Molecular nutrition & food research 57(4): 562-

581.

Parkka, J., M. Ermes, et al. (2006). "Activity classification using realistic data from wearable

sensors." IEEE transactions on information technology in biomedicine : a publication of the

IEEE Engineering in Medicine and Biology Society 10(1): 119-128.

Pate, R. R., J. R. O'Neill, et al. (2008). "The evolving definition of "sedentary"." Exercise and sport

sciences reviews 36(4): 173-178.

Pate, R. R., M. Pratt, et al. (1995). "Physical activity and public health. A recommendation from

the Centers for Disease Control and Prevention and the American College of Sports

Medicine." JAMA : the journal of the American Medical Association 273(5): 402-407.

Patel, A. V., C. Rodriguez, et al. (2006). "Recreational physical activity and sedentary behavior in

relation to ovarian cancer risk in a large cohort of US women." American journal of

epidemiology 163(8): 709-716.

Plasqui, G. and K. R. Westerterp (2005). "Accelerometry and heart rate as a measure of physical

fitness: proof of concept." Medicine and science in sports and exercise 37(5): 872-876.

Ploug, T., T. Ohkuwa, et al. (1995). "Effect of immobilization on glucose transport and glucose

transporter expression in rat skeletal muscle." The American journal of physiology 268(5

Pt 1): E980-986.

Pober, D. M., J. Staudenmayer, et al. (2006). "Development of novel techniques to classify

physical activity mode using accelerometers." Medicine and science in sports and exercise

38(9): 1626-1634.

Precope, J. (1952). Hippocrates on diet and hygiene. London, UK, Williams, Lea, and Company.

Preece, S. J., J. Y. Goulermas, et al. (2009). "Activity identification using body-mounted sensors--a

review of classification techniques." Physiological measurement 30(4): R1-33.

Puhl, J., K. Greaves, et al. (1990). "Children's Activity Rating Scale (CARS): description and

calibration." Research quarterly for exercise and sport 61(1): 26-36.

Reilly, J. J., V. Penpraze, et al. (2008). "Objective measurement of physical activity and sedentary

behaviour: review with new data." Archives of disease in childhood 93(7): 614-619.

Riddoch, C. J., L. Bo Andersen, et al. (2004). "Physical activity levels and patterns of 9- and 15-yr-

old European children." Medicine and science in sports and exercise 36(1): 86-92.

Rosdahl, H., L. Gullstrand, et al. (2010). "Evaluation of the Oxycon Mobile metabolic system

against the Douglas bag method." European journal of applied physiology 109(2): 159-171.

Rosenberger, M. E., W. L. Haskell, et al. (2013). "Estimating activity and sedentary behavior from

an accelerometer on the hip or wrist." Med Sci Sports Exerc 45(5): 964-975.

Rothney, M. P., M. Neumann, et al. (2007). "An artificial neural network model of energy

expenditure using nonintegrated acceleration signals." Journal of applied physiology

103(4): 1419-1427.

Rothney, M. P., E. V. Schaefer, et al. (2008). "Validity of physical activity intensity predictions by

ActiGraph, Actical, and RT3 accelerometers." Obesity 16(8): 1946-1952.

Rowlands, A. V., T. S. Olds, et al. (2014). "Assessing Sedentary Behavior with the GENEActiv:

Introducing the Sedentary Sphere." Medicine and science in sports and exercise 46(6):

1235-1247.

Rumo, M., O. Amft, et al. (2011). "A stepwise validation of a wearable system for estimating

energy expenditure in field-based research." Physiological measurement 32(12): 1983-

2001.

Ryan, C. G., P. M. Grant, et al. (2006). "The validity and reliability of a novel activity monitor as a

measure of walking." British journal of sports medicine 40(9): 779-784.

Safrit, M. and T. Wood (1995). Introduction to measurement in physical education and exercise

science. St. Louis, MO, Mosby.

Sallis, J. F., M. J. Buono, et al. (1990). "The Caltrac accelerometer as a physical activity monitor

for school-age children." Medicine and science in sports and exercise 22(5): 698-703.

Sallis, J. F. and B. E. Saelens (2000). "Assessment of physical activity by self-report: status,

limitations, and future directions." Research quarterly for exercise and sport 71(2 Suppl):

S1-14.

Santos-Lozano, A., P. J. Marin, et al. (2012). "Technical variability of the GT3X accelerometer."

Med Eng Phys 34(6): 787-790.

Santos-Lozano, A., G. Torres-Luque, et al. (2012). "Intermonitor variability of GT3X

accelerometer." International journal of sports medicine 33(12): 994-999.

Sasaki, J. E., D. John, et al. (2011). "Validation and comparison of ActiGraph activity monitors."

Journal of science and medicine in sport / Sports Medicine Australia 14(5): 411-416.

SBRN (2012). "Letter to the Editor: Standardized use of the terms "sedentary" and "sedentary

behaviours"." Appl Physiol Nutr Metab 37: 540-542.

Schrage, W. G. (2008). "Not a search in vein: novel stimulus for vascular dysfunction after

simulated microgravity." Journal of applied physiology 104(5): 1257-1258.

Seider, M. J., W. F. Nicholson, et al. (1982). "Insulin resistance for glucose metabolism in disused

soleus muscle of mice." The American journal of physiology 242(1): E12-18.

Shephard, R. J. (1990). "Physical activity and cancer." International journal of sports medicine

11(6): 413-420.

Shephard, R. J. (2003). "Limits to the measurement of habitual physical activity by

questionnaires." British journal of sports medicine 37(3): 197-206; discussion 206.

Shepherd, E. F., E. Toloza, et al. (1999). "Step activity monitor: increased accuracy in quantifying

ambulatory activity." J Orthop Res 17(5): 703-708.

Shields, M. and M. S. Tremblay (2008). "Sedentary behaviour and obesity." Health reports /

Statistics Canada, Canadian Centre for Health Information = Rapports sur la sante /

Statistique Canada, Centre canadien d'information sur la sante 19(2): 19-30.

Skotte, J., M. Korshoj, et al. (2012). "Detection of Physical Activity Types Using Triaxial

Accelerometers." Journal of Physical Activity & Health.

Skotte, J., M. Korshoj, et al. (2014). "Detection of physical activity types using triaxial

accelerometers." Journal of physical activity & health 11(1): 76-84.

Slattery, M. L. (2004). "Physical activity and colorectal cancer." Sports medicine 34(4): 239-252.

Slootmaker, S. M., A. J. Schuit, et al. (2009). "Disagreement in physical activity assessed by

accelerometer and self-report in subgroups of age, gender, education and weight status."

The international journal of behavioral nutrition and physical activity 6: 17.

Smorawinski, J., P. Kubala, et al. (1996). "Effects of three day bed-rest on circulatory, metabolic

and hormonal responses to oral glucose load in endurance trained athletes and untrained

subjects." Journal of gravitational physiology : a journal of the International Society for

Gravitational Physiology 3(2): 44-45.

Spurr, G. B., A. M. Prentice, et al. (1988). "Energy expenditure from minute-by-minute heart-rate

recording: comparison with indirect calorimetry." The American journal of clinical

nutrition 48(3): 552-559.

Stamatakis, E., M. Hamer, et al. (2011). "Screen-based entertainment time, all-cause mortality, and

cardiovascular events: population-based study with ongoing mortality and hospital events

follow-up." Journal of the American College of Cardiology 57(3): 292-299.

Staudenmayer, J., D. Pober, et al. (2009). "An artificial neural network to estimate physical activity

energy expenditure and identify physical activity type from an accelerometer." Journal of

applied physiology 107(4): 1300-1307.

Strath, S. J., D. R. Bassett, Jr., et al. (2001). "Simultaneous heart rate-motion sensor technique to

estimate energy expenditure." Medicine and science in sports and exercise 33(12): 2118-

2123.

Strath, S. J., D. R. Bassett, Jr., et al. (2002). "Validity of the simultaneous heart rate-motion sensor

technique for measuring energy expenditure." Medicine and science in sports and exercise

34(5): 888-894.

Stuart, C. A., R. E. Shangraw, et al. (1988). "Bed-rest-induced insulin resistance occurs primarily

in muscle." Metabolism: clinical and experimental 37(8): 802-806.

Sun, D. X., G. Schmidt, et al. (2008). "Validation of the RT3 accelerometer for measuring physical

activity of children in simulated free-living conditions." Pediatric exercise science 20(2):

181-197.

Swartz, A. M., L. Squires, et al. (2011). "Energy expenditure of interruptions to sedentary

behavior." The international journal of behavioral nutrition and physical activity 8: 69.

Swartz, A. M., S. J. Strath, et al. (2000). "Estimation of energy expenditure using CSA

accelerometers at hip and wrist sites." Medicine and science in sports and exercise 32(9

Suppl): S450-456.

Tapia, E. M., S. S. Intille, et al. (2007). "Real-time recognition of physical activities and their

intensities using wireless accelerometers and a heart rate monitor." Proceedings of the

International Symposium on Wearable Computers.

Thompson, F. and A. Subar (2013). Dietary assessment methodology. Nutrition in the prevention

and treatment of disease. A. Coulston, C. Boushey and M. Ferruzzi. London, UK, Elsevier.

3.

Thorp, A. A., N. Owen, et al. (2011). "Sedentary behaviors and subsequent health outcomes in

adults: a systematic review of longitudinal studies, 1996-2011." American journal of

preventive medicine 41(2): 207-215.

Thune, I. and A. S. Furberg (2001). "Physical activity and cancer risk: dose-response and cancer,

all sites and site-specific." Medicine and science in sports and exercise 33(6 Suppl): S530-

550; discussion S609-510.

Tobin, B. W., P. N. Uchakin, et al. (2002). "Insulin secretion and sensitivity in space flight:

diabetogenic effects." Nutrition 18(10): 842-848.

Troiano, R. P., D. Berrigan, et al. (2008). "Physical activity in the United States measured by

accelerometer." Medicine and science in sports and exercise 40(1): 181-188.

Troiano, R. P. and J. J. McClain (2012). Objective measures of physical activity, sleep,

and strength in US National Health and Nutrition Examination Survey (NHANES) 2011-

2014. The 8th International Conference on Diet and Activity Methods, Rome, Italy.

Troiano, R. P., J. J. McClain, et al. (2014). "Evolution of accelerometer methods for physical

activity research." British journal of sports medicine.

Trost, S. G., K. L. McIver, et al. (2005). "Conducting accelerometer-based activity assessments in

field-based research." Medicine and science in sports and exercise 37(11): S531-S543.

Trost, S. G., W. K. Wong, et al. (2012). "Artificial neural networks to predict activity type and

energy expenditure in youth." Medicine and science in sports and exercise 44(9): 1801-

1809.

Tudor-Locke, C. E. and A. M. Myers (2001). "Challenges and opportunities for measuring

physical activity in sedentary adults." Sports medicine 31(2): 91-100.

UBCC. (2009). "Category 2 enhanced phenotyping at baseline assessment visit in last 100-150,000

participants." from http://www.ukbiobank.ac.uk/wp-content/uploads/2011/06/Protocol_addendum_2.pdf.

van Hees, V. T., R. Golubic, et al. (2013). "Impact of study design on development and evaluation

of an activity-type classifier." Journal of applied physiology 114(8): 1042-1051.

van Poppel, M. N., M. J. Chinapaw, et al. (2010). "Physical activity questionnaires for adults: a

systematic review of measurement properties." Sports medicine 40(7): 565-600.

Vanhelst, J., G. Baquet, et al. (2012). "Comparative interinstrument reliability of uniaxial and

triaxial accelerometers in free-living conditions." Percept Mot Skills 114(2): 584-594.

Veltink, P. H., H. B. Bussmann, et al. (1996). "Detection of static and dynamic activities using

uniaxial accelerometers." IEEE Trans Rehabil Eng 4(4): 375-385.

Webster, J. B., D. F. Kripke, et al. (1982). "An activity-based sleep monitor system for ambulatory

use." Sleep 5(4): 389-399.

Welch, W. A., D. R. Bassett, et al. (2014). "Cross-validation of Waist-Worn GENEA

Accelerometer Cut-Points." Medicine and science in sports and exercise.

Welch, W. A., D. R. Bassett, et al. (2013). "Classification accuracy of the wrist-worn gravity

estimator of normal everyday activity accelerometer." Medicine and science in sports and

exercise 45(10): 2012-2019.

Welk, G. J. (2002). "Reliability of the CSA activity monitor for assessing physical activity."

Research quarterly for exercise and sport 73: A14.

Welk, G. J. (2002). Use of Accelerometry-Based Activity Monitors to Assess Physical Activity.

Physical Activity Assessments for Health-Related Research. G. J. Welk. Champaign, IL,

Human Kinetics, Inc.: 125-142.

Welk, G. J. and C. B. Corbin (1995). "The validity of the Tritrac-R3D Activity Monitor for the

assessment of physical activity in children." Research quarterly for exercise and sport

66(3): 202-209.

Welk, G. J., J. J. McClain, et al. (2007). "Field validation of the MTI Actigraph and BodyMedia

armband monitor using the IDEEA monitor." Obesity 15(4): 918-928.

Westerterp, K. R. (1999). "Assessment of physical activity level in relation to obesity: current

evidence and research issues." Medicine and science in sports and exercise 31(11 Suppl):

S522-525.

Wijndaele, K., G. N. Healy, et al. (2010). "Increased cardiometabolic risk is associated with

increased TV viewing time." Medicine and science in sports and exercise 42(8): 1511-

1518.

Wong, T. C., J. G. Webster, et al. (1981). "Portable accelerometer device for measuring human

energy expenditure." IEEE transactions on bio-medical engineering 28(6): 467-471.

Yanagibori, R., K. Kondo, et al. (1998). "Effect of 20 days' bed rest on the reverse cholesterol

transport system in healthy young subjects." Journal of internal medicine 243(4): 307-312.

Yanagibori, R., Y. Suzuki, et al. (1997). "The effects of 20 days bed rest on serum lipids and

lipoprotein concentrations in healthy young subjects." Journal of gravitational physiology :

a journal of the International Society for Gravitational Physiology 4(1): S82-90.

Zderic, T. W. and M. T. Hamilton (2006). "Physical inactivity amplifies the sensitivity of skeletal

muscle to the lipid-induced downregulation of lipoprotein lipase activity." Journal of

applied physiology 100(1): 249-257.

Zerwekh, J. E., L. A. Ruml, et al. (1998). "The effects of twelve weeks of bed rest on bone

histology, biochemical markers of bone turnover, and calcium homeostasis in eleven

normal subjects." Journal of bone and mineral research : the official journal of the

American Society for Bone and Mineral Research 13(10): 1594-1601.

Zhang, K., F. X. Pi-Sunyer, et al. (2004). "Improving energy expenditure estimation for physical

activity." Medicine and science in sports and exercise 36(5): 883-889.

Zhang, K., P. Werner, et al. (2003). "Measurement of human daily physical activity." Obesity

research 11(1): 33-40.

Zhang, S., A. V. Rowlands, et al. (2012). "Physical activity classification using the GENEA wrist-

worn accelerometer." Medicine and science in sports and exercise 44(4): 742-748.