Recognition of Nutrition Intake using Time-Frequency...

8
1530-437X (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/JSEN.2015.2402652, IEEE Sensors Journal 1 Recognition of Nutrition Intake using Time-Frequency Decomposition in a Wearable Necklace using a Piezoelectric Sensor Nabil Alshurafa, Student Member, IEEE, Haik Kalantarian, Student Member, IEEE, Mohammad Pourhomayoun, Student Member, IEEE, Jason J. Liu, Student Member, IEEE, Shruti Sarin, Student Member, IEEE, Behnam Shahbazi, Student Member, IEEE, and Majid Sarrafzadeh, Fellow, IEEE Abstract—Food intake levels, hydration, ingestion rate, and dietary choices are all factors known to impact the risk of obesity. This paper presents a novel wearable system in the form of a necklace, which aggregates data from an embedded piezoelectric sensor capable of detecting skin motion in the lower trachea during ingestion. The skin motion produces an output voltage with varying frequencies over time. As a result we propose an algorithm based on time-frequency decomposition, spectrogram analysis of piezoelectric sensor signals, to accurately distinguish between food types such as liquid and solid, hot and cold drinks and hard and soft foods. The necklace transmits data to a smartphone, which performs the processing of the signals, classifies the food type, and provides visual feedback to the user to assist the user in monitoring their eating habits over time. We compare our spectrogram analysis with other time-frequency features such as Matching Pursuit (MP) and Wavelets. Experimental results demonstrate promise in using time-frequency features, with high accuracy of distinguishing be- tween food categories using spectrogram analysis and extracting key features representative of the unique swallow patterns of various foods. Index Terms—Nutrition Monitoring; Wearable Necklace; Spec- trogram Analysis; Piezoelectric Sensor; Machine Learning; Clas- sification. I. MOTIVATION AND BACKGROUND Healthy eating is associated with reduced risk for many diseases, including several of the leading causes of death: heart disease, some cancers, stroke, and diabetes [1]. The development and the incorporation of wireless technologies has the potential to address our ultimate goal of enabling healthier lifestyle choices and behavior modification needed to prevent obesity and obesity-related diseases. Much of the wireless technology developed and used in the market, howev- er, focuses primarily on exercise and physical activity [7], [3], [4], [6], [15], [32]. In this paper, we describe a novel system that attempts to infer eating patterns from a device disguised as a necklace. Automatically and accurately recognizing the type of food in a non-intrusive manner has been for the most part an unaddressed challenge. Most of the current technologies for eating pattern detection are either inaccurate or exhibit low N. Alshurafa, H. Kalantarian, M. Pouhomayoun, J. J. Liu, S. Sarin, B. Shahbazi and M. Sarrafzadeh are with the Wireless Health Institute, Department of Computer Science, University of California, Los Angeles, 90095. *Email correspondence to [email protected] Eating too fast Stay hydrated Skipping meals Need to eat more Feedback Eating Habits Data Processing Fig. 1. Monitoring eating habits is essential to promoting healthy lifestyle behavior. In a fast-paced society, some common bad eating habits include eating too fast, skipping meals, not eating enough, and being dehydrated. Having a non-invasive automated method of detecting bad habits can be a means of changing negative habits into positive ones. rates of adherence to using the technology, due to one or more of these shortcomings: 1) they infer eating indirectly from, for example, hand movements or food images [10], [37]; 2) they require manual data entry or user involvement in capturing data [19]; or 3) they are non-wearable, bulky, invasive, or semi-invasive [11]. There is a need for a system that is non- invasive and detects individual’s eating patterns, and provides necessary guidance and feedback to the user (see Figure 1). Such a system represents a significant advance in researchers’ ability to evaluate the combined impact of adherence to dietary guidelines. Current systems either have low accuracy in detecting swallows and distinguishing food types, or must be uncom- fortably worn around the neck, which renders continuous use impractical [8]. In this paper, we focus on a piezoelectric-

Transcript of Recognition of Nutrition Intake using Time-Frequency...

Page 1: Recognition of Nutrition Intake using Time-Frequency ...web.cs.ucla.edu/~bshahbazi/publications/Nabil_necklace.pdfof the necklace, along with a small microcontroller board capable

1530-437X (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/JSEN.2015.2402652, IEEE Sensors Journal

1

Recognition of Nutrition Intake usingTime-Frequency Decomposition in a Wearable

Necklace using a Piezoelectric SensorNabil Alshurafa, Student Member, IEEE, Haik Kalantarian, Student Member, IEEE,

Mohammad Pourhomayoun, Student Member, IEEE, Jason J. Liu, Student Member, IEEE, Shruti Sarin, StudentMember, IEEE, Behnam Shahbazi, Student Member, IEEE, and Majid Sarrafzadeh, Fellow, IEEE

Abstract—Food intake levels, hydration, ingestion rate, anddietary choices are all factors known to impact the risk ofobesity. This paper presents a novel wearable system in theform of a necklace, which aggregates data from an embeddedpiezoelectric sensor capable of detecting skin motion in thelower trachea during ingestion. The skin motion produces anoutput voltage with varying frequencies over time. As a resultwe propose an algorithm based on time-frequency decomposition,spectrogram analysis of piezoelectric sensor signals, to accuratelydistinguish between food types such as liquid and solid, hot andcold drinks and hard and soft foods. The necklace transmitsdata to a smartphone, which performs the processing of thesignals, classifies the food type, and provides visual feedbackto the user to assist the user in monitoring their eating habitsover time. We compare our spectrogram analysis with othertime-frequency features such as Matching Pursuit (MP) andWavelets. Experimental results demonstrate promise in usingtime-frequency features, with high accuracy of distinguishing be-tween food categories using spectrogram analysis and extractingkey features representative of the unique swallow patterns ofvarious foods.

Index Terms—Nutrition Monitoring; Wearable Necklace; Spec-trogram Analysis; Piezoelectric Sensor; Machine Learning; Clas-sification.

I. MOTIVATION AND BACKGROUND

Healthy eating is associated with reduced risk for manydiseases, including several of the leading causes of death:heart disease, some cancers, stroke, and diabetes [1]. Thedevelopment and the incorporation of wireless technologieshas the potential to address our ultimate goal of enablinghealthier lifestyle choices and behavior modification neededto prevent obesity and obesity-related diseases. Much of thewireless technology developed and used in the market, howev-er, focuses primarily on exercise and physical activity [7], [3],[4], [6], [15], [32]. In this paper, we describe a novel systemthat attempts to infer eating patterns from a device disguisedas a necklace.

Automatically and accurately recognizing the type of foodin a non-intrusive manner has been for the most part anunaddressed challenge. Most of the current technologies foreating pattern detection are either inaccurate or exhibit low

N. Alshurafa, H. Kalantarian, M. Pouhomayoun, J. J. Liu, S. Sarin,B. Shahbazi and M. Sarrafzadeh are with the Wireless Health Institute,Department of Computer Science, University of California, Los Angeles,90095.

*Email correspondence to [email protected]

Eating too fast

Stay hydrated

Skipping meals

Need to eat more

Feedback

Eating Habits

Data Processing

Fig. 1. Monitoring eating habits is essential to promoting healthy lifestylebehavior. In a fast-paced society, some common bad eating habits includeeating too fast, skipping meals, not eating enough, and being dehydrated.Having a non-invasive automated method of detecting bad habits can be ameans of changing negative habits into positive ones.

rates of adherence to using the technology, due to one or moreof these shortcomings: 1) they infer eating indirectly from, forexample, hand movements or food images [10], [37]; 2) theyrequire manual data entry or user involvement in capturingdata [19]; or 3) they are non-wearable, bulky, invasive, orsemi-invasive [11]. There is a need for a system that is non-invasive and detects individual’s eating patterns, and providesnecessary guidance and feedback to the user (see Figure 1).Such a system represents a significant advance in researchers’ability to evaluate the combined impact of adherence to dietaryguidelines.

Current systems either have low accuracy in detectingswallows and distinguishing food types, or must be uncom-fortably worn around the neck, which renders continuous useimpractical [8]. In this paper, we focus on a piezoelectric-

Page 2: Recognition of Nutrition Intake using Time-Frequency ...web.cs.ucla.edu/~bshahbazi/publications/Nabil_necklace.pdfof the necklace, along with a small microcontroller board capable

1530-437X (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/JSEN.2015.2402652, IEEE Sensors Journal

2

based design of a necklace that is not worn tightly around theneck, but rather hangs loosely and falls more naturally rightabove the sternum [24], [5], [20], [21], [23].

Equally critical to detecting eating episodes is determiningwhether calories are consumed in solid or liquid form. Studiesshow reduced ability of the body to compensate properly whencalories are consumed in liquid form compared to solid form[30], with the result that some health recommendations nowexplicitly recommend restrictions on liquid calories consumed(e.g., 2007 report by the World Cancer Research Fund [16].The proposed system will enable individuals to track theamount of solid vs. liquid consumed throughout the day.

This paper is organized as follows. Related works aredescribed in Section II. In Section III we describe the nu-trition monitoring necklace. Then we describe our methodof differentiating food types using spectrogram-based featureextraction and classification algorithm in Section IV. Wepresent our experimental setup in Section V, and results inSection VI. We conclude in Section VII.

II. RELATED WORK

Various sensors have been employed to identify the volumeof food being consumed, and among the most popular methodsis acoustic detection [22]. Several systems have identifiedchewing and swallowing acoustically by placing a microphonenear the throat, and using signal processing techniques forclassification. For example, Sazanov et al. [35] uses acousticdata acquired from a small microphone placed near the bottomof the throat. However, their system is coupled with a straingauge placed near the ear which is not practical for dailyuse. Similarly, Nagae et al. [33] attempts to distinguish be-tween swallowing, coughing, and vocalization using wavelet-transform analysis of audio data. Though results are promising,this technology is targeted towards those who suffer fromdysphagia, and identifying the volume or characteristic of foodintake is not the focus of their work.

Aboofazeli et al. [2] present another approach to acous-tic swallow detection, achieving basic classification betweenswallows and breath sounds using a feedforward neural net-work classifier. A manual inspection of their classificationresults is performed using a spectrogram, which is a basis forthe feature extraction technique for food classification usedin our work. Makayev et al. apply spectrograms for swallowdetection using machine learning algorithms [27], though onceagain, no classification is performed, and their analysis islimited to identifying swallows. Ultimately, acoustic detectionof food intake is promising, but suffers from several seriousdrawbacks including the interference of background noise, alack of uniformity between individual eating styles, and noprior work validating the feasibility of classification betweendifferent types of food.

Several other methods for detecting swallows have beenexplored. Amft et al. [9] performs detection of eating anddrinking by identifying associated arm gestures using ac-celerometers and gyroscopes. For example, the use of cutlery,spoon, hand, and cup can be identified based on the gesturesassociated with food intake using these objects. However, the

Sensors MCU

Board RF

Module

LiPo

Battery

Fashion necklace

MCU Board

RF Module

LiPo Battery

Piezoelectric Sensor

Skin

Sensor

Technology

Swallow

Detection

Food

Classification

User Guidance

and Feedback

Fig. 2. The essential components of the system. Besides the sensor technol-ogy, the main challenges in designing the system involve swallow detectionand food classification. User guidance and feedback is also part of the systemand has the potential in being of great benefit to the user regarding improvinghealth.

eating style does not necessarily reveal the volume of foodintake, which severely limits the usefulness of this approach.Other works place electrodes on the neck and perform an EMGto identify deglutition, but the hardware is cumbersome andthe system is limited to a clinical environment [26].

Piezoelectric sensors, which produce an output voltagecorresponding with the mechanical stress applied to the bodyof the sensor, are used in countless applications. Recently,they have been applied to problems in the medical domain,such as identifying individual heart beats and respiration [25].Very few works describe attempts to use piezoelectric sensorsfor monitoring food ingestion, with several exceptions [13],though evaluation of dysphagia symptoms is the primaryobjective of their work.

In this paper, we compare the accuracy of classificationtechniques from a piezoelectric sensor worn around the neckusing statistical features extracted in the time domain to anovel spectrogram-based approach which considers time andfrequency-based components in tandem. A spectrogram, oftenused for speech recognition and other countless applications,is a visual representation of the frequency spectrum overtime generated using a short-time Fourier transform (STFT)with a fixed window size, the squared magnitude of whichyields the spectrogram. Spectrograms are used to visuallyrepresent changes in the frequency spectrum over time, havebeen applied to countless research problems pertaining tothe analysis of acoustic signals. Examples include speechrecognition, the identification of animal sounds such as whalevocalizations, and pattern recognition in genome sequences[31], [36], [12]. However, their utility in analyzing piezoelec-tric sensor data has not been adequately explored. The primarynovelty of our work is the application of spectrograms foranalysis of piezoelectric sensor data in the realm of detectionand classification of food ingestion. In this paper we collectstatistical features on a spectrogram of swallows to distinguishbetween solid and liquid foods.

III. NUTRITION MONITORING NECKLACE DESIGN

Our nutrition monitoring system comprises two main com-ponents: piezoelectric-based sensor technology, and a smart-phone application that performs data processing, user guid-ance, and feedback. The smartphone application performsswallow detection, feature extraction and classification to

Page 3: Recognition of Nutrition Intake using Time-Frequency ...web.cs.ucla.edu/~bshahbazi/publications/Nabil_necklace.pdfof the necklace, along with a small microcontroller board capable

1530-437X (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/JSEN.2015.2402652, IEEE Sensors Journal

3

detect swallows. This section describes the sensor technologyand user guidance and feedback (See Figure 5). Section IVwill further discuss the classification algorithms implementedon the smartphone. Figure 2 provides a component overviewof the system.

A. Sensor Technology

A piezoelectric sensor, also known as a vibration sensor,produces a voltage when subjected to physical strain. Byplacing a piezoelectric sensor against the throat, the musclecontraction and motion of the skin during a swallow isrepresented in the output voltage of the sensor, when sampledat frequencies as low as 5 Hz. Our necklace features a thin,lightweight piezoelectric vibration sensor attached to the insideof the necklace, along with a small microcontroller boardcapable of sampling the sensor and transmitting the data toa mobile phone via Bluetooth. The hardware is powered by alightweight lithium-polymer battery. Figure 4 depicts a subjectwearing the necklace, and further illustrates each component.

Figure 3 models the piezoelectric sensor as a nonlinearvoltage source when subject to mechanical excitation. Thedata is smoothed using a 90nF filtering capacitor, and asmall resistor brings voltage levels to the valid input rangeof the Analog/Digital converter unit (ADC). After sampling iscomplete, the data is buffered into SRAM memory, processed,and transmitted. The on-board ADC has a resolution of 10bits and can convert data at a rate of up to 257 kHz. Theoffset error, gain error, and absolute error ratings are 1, 3,and 3 LSB volts, respectively. The resolution of the ADCwas therefore approximately 15 mV, based on the supportedinput voltage range. This was sufficient for the purposes of anutrition monitoring application, as swallows were typicallyassociated with a voltage spike of 50 mV or more. Thehardware platform supports an input voltage range of 1.8-3.6volts. The lithium-polymer battery used to power the device isa 3.7 volt unregulated voltage source with a capacity of 170mAh and a maximum discharge current of 1 Ampere at roomtemperature.

Fig. 3. Circuitry diagram for the system.

The necklace is available in several varieties including asportsband suitable for athletes and other active individuals,and another targeted towards a more fashion-conscious audi-ence. Because the hardware components of the necklace are

very small and lightweight, they can be embedded in severaldifferent form factors.

Fig. 4. This figure shows a subject wearing the necklace while working. Thenecklace comprises the piezoelectric sensor, coin battery, MCU/RF board anda fashionable cover, where the sensor is designed to be in contact with theskin.

The microcontroller board samples the voltage of the vi-bration sensor at a rate of 20Hz, converting the voltage to adigital signal using the on-chip A/D converter. The data is thenbuffered and transmitted to a mobile phone. This Arduino-compatible board features a Bluetooth 4.0 LE transceiver on-board, based on the RFD22301 SMT module. The embeddedprocessor is an ARM Cortex M0 with 256kB of flash memoryand 16kB of RAM.

B. Device Battery Lifetime

The device battery lifetime depends on many parametersincluding sample rate, battery capacity, Bluetooth connectioninterval, and various algorithm parameters such as windowsize and sample rate. A CR2032 coin-cell battery typicallyhas a capacity of approximately 235mAh. Our experimentalsimulations reveal that a window size of 10 and a sample rateof 20Hz results in a power consumption of .07 mW, using theNordic nRF simulation software and a low-power MSP430microcontroller. This would correspond with a device lifetimeof over 6 months. However, the hardware platform used in thispaper is not optimized for energy-efficient applications, as thefocus is aggregating data for offline processing to evaluate ouralgorithms.

C. User Guidance and Feedback

This system includes a mobile phone application for datareporting and visualization (see Figure 5). The applicationdisplays the estimated liquid and solid volume of the currentmeal, as well as the daily and monthly total. A reporting tooldisplays alerts to the user.

The mobile application uses the Bluetooth 4.0 LE protocolto receive data from the necklace while maximizing batterylife. The data is then processed for swallow identification,classification, and analytics. To ensure user compliance, the

Page 4: Recognition of Nutrition Intake using Time-Frequency ...web.cs.ucla.edu/~bshahbazi/publications/Nabil_necklace.pdfof the necklace, along with a small microcontroller board capable

1530-437X (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/JSEN.2015.2402652, IEEE Sensors Journal

4

application is able to detect when the necklace has beenremoved, based on the observation that the average individualswallows saliva periodically. Lastly, all collected and pro-cessed data is uploaded to a secured cloud server for patienttracking and statistical analysis.

Sensors MCU

Board RF

Module

LiPo

Battery

Fashion necklace

MCU Board

RF Module

LiPo Battery

Piezoelectric Sensor

Skin

Fig. 5. Mobile application screens. The snapshot on the left shows a summaryof the users weekly food and beverage consumption. The snapshot to the rightprovides a summary of habits for the user, some positive and others negative.

IV. ALGORITHM

A. Swallow Detection

Figure 6 provides a summary of the algorithm implementedon the mobile phone, which is used to detect swallows basedon data acquired from the vibration sensor and receivedvia Bluetooth. The data is buffered locally until a sufficientnumber of samples have been acquired. Subsequently, a slidingwindow is applied to generate a new waveform representingthe standard deviation of the original data. The swallows arerepresented in the resulting waveform as peaks, while they maycorrespond to either peaks or troughs in the original data.

The algorithm then proceeds to smooth the waveform byapplying a Savitzky-Golay convolution filter to increase thesignal-to-noise ratio without distorting the signal, which yieldsclearly visible peaks representative of each original swallow,while removing noise from the signal [34]. Subsequently,the number of swallows can be identified by counting thenumber of peaks, provided there is sufficient spacing betweenswallows.

B. Spectrogram

Once a swallow is detected, a spectrogram is generatedcentered around each swallow. The spectrogram is calculatedfrom the time signal x(t), as shown in Equation 1 using theshort-time Fourier transform (STFT).

STFT{x(t)} ≡ X(n, ω) =

∞∑t=−∞

x[t]ω[t− n]e−jωt. (1)

x(t) is multiplied by a window function for a short periodof time. The data is divided into frames Fi, which overlap.Each frame is Fourier transformed, and the result is added to

Your text here

(2) Standard Deviation

Calculation via Sliding Window

(3) Smoothing and Filtering

(4) Peak Detection

(1) Data Acquisition

(2) Standard Deviation

Calculation via Sliding Window

(3) Smoothing and Filtering

(4) Peak Detection

(1) Data

Fig. 6. This figure shows the swallow detection process, whereby a slidingwindow is applied to the signal to generate a waveform representing thestandard deviation of the signal, after smoothing, peaks are detected andidentified as potential swallow regions.

a matrix that records the magnitude and phase of each pointin time and frequency.

The spectrogram is the resulting 3-dimensional plot of theenergy of the frequency content of a signal as it changes overtime [14]. For our window function, we used different values;for the Hamming window we tried lengths of w = 32, 64,and 128, with an FFT length of nfft = 32, 64, and 128, andan of overlap of 25%, 50%, 75%, and no overlap. We setthe dynamics range to 50dB. Figure 7 provides an illustrationshowing a sample swallow spectrogram for three food types(water, chips, and sandwich). Each spectrogram is defined bya matrix P ∈ Rm×k, where m is the number of bins in thetime domain, and k is the number of bins in the frequencydomain. P represents the power spectral density.

The distinguishing attributes of these piezoelectric signalsare visible. For example, chips and sandwich swallows containmore high frequency components than water swallows dueto the effect of chewing. Distinguishing between chips andsandwich swallows, though significantly more challenging, iscaptured by our statistical feature extraction methodology.

C. Feature Extraction

Once a spectrogram for each swallow is generated, we foundan optimal division of the spectrogram images into 14 binsalong the frequency domain and another 16 bins along thetime domain, for a total of 30 bins. We then calculate statisticalfeatures on each bin, to generate a feature vector Vi for eachswallow. Table I lists the main features that were calculatedfor each bin, which generates a total of s = 360 features perspectrogram swallow. We generate a matrix B which containsall n samples for each spectrogram Pi.

Page 5: Recognition of Nutrition Intake using Time-Frequency ...web.cs.ucla.edu/~bshahbazi/publications/Nabil_necklace.pdfof the necklace, along with a small microcontroller board capable

1530-437X (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/JSEN.2015.2402652, IEEE Sensors Journal

5

db db db Frequency (Hz) Frequency (Hz) Frequency (Hz)

(a) Water (b) Chips (c) Sandwich

Fig. 7. Figure a shows the spectrogram over a 3 second window of water. Figure b and Figure c show the spectrogram for chips and sandwich. The distinctionbetween water and solid is captured in the spectrogram. However, the difference between chips and sandwich is more challenging, and the statistical featurescollected from the spectrogram images is capable of learning the difference.

B = [V1, V2, ...., Vi, ..., Vn]

=

v11 · · · v1sv21 · · · v2k

.... . .

...vn1 · · · vns

(2)

where vi,j represents the jth feature of the ith sample.

TABLE IFEATURE TABLE

Mean Geometric Mean Std. Dev.Skewness Mean of Standardized Z-Scores IQRKurtosis Harmonic Mean Rank Corr.Range Median Absolute Deviation Partial Corr.

D. Feature Selection and Classification

The conventional feature selection algorithms usually focuson specific metrics to quantify the relevance and/or redun-dancy to find the smallest subset of features that providesthe maximum amount of useful information for prediction.Thus, the main goal of feature selection algorithms is toeliminate redundant or irrelevant features in a given featureset. Applying an effective feature selection algorithm not onlydecreases the computational complexity of the system byreducing the dimensionality and eliminating the redundancy,but also increases the performance of the classifier by deletingirrelevant and confusing information.

The two well-known feature selection categories are thefilter and wrapper methods. Filter methods use a specificmetric to score each individual feature (or a subset of featurestogether), and are usually fast and much less computationallyintensive. Wrapper methods usually utilize a classifier toevaluate feature subsets in an iterative manner according totheir predictive power [17]. We applied the wrapper method,testing on multiple combinations of feature subsets and classi-fiers including: kNN, Bayesian Network, Random Forest. Wereduce the dimensionality of the features from s = 360 tol, where l depends on the feature selection algorithm used,

however we limit it to l = 30. The optimal feature extraction,feature selection and classification combination is selectedto run in real time. We then divide each data sample intotraining and testing samples. Figure 8 provides an illustrationof the system architecture, where an optimal feature subsetand classifier is trained to distinguish between food types.

V. EXPERIMENTAL SETUP

Two experiments were performed to validate the efficacy ofour algorithm in accurately detecting swallows and recogniz-ing eating patterns. The first experiment involved 10 subjects,and the second experiment included an additional 10 subjects(for a total of 20 subjects). The two experiments are describedin this section.

Once a swallow is detected, we test the ability of oursystem to classify swallows using features collected from threetime-frequency based algorithms including: Matching Pursuitwith dictionaries of Gabor functions [29], Morlet Wavelet (asit is closely related to human perception, both hearing andvision) with statistical features collected from its correspond-ing scaleogram[28], and statistical features collected from aspectrogram.

To prevent bias in the classification results between eachclass label in the training set, we randomly select an equalnumber of swallows across categories. We also perform 10-fold cross validation and report the results. We test eachclassifier’s ability to distinguish between different food types.

A. Experiment 1

In the first experiment data was collected on ten subjects,two female and eight male with ages ranging between 20 and40 years of age. We placed the necklace around their neck sothat the sensor was loosely touching the skin. The necklacetightness was adjusted such that each subject was comfortablewearing the device. We placed the necklace centered betweentheir right and left clavicle right above the sternum.

Each subject consumed two types of food: a tuna, egg, orchicken sandwich on white bread, and a few pieces of Pringlespotato chips. Each subject selected which food type to starteating or drinking first. The subjects were asked to consumean 8 oz glass of water at room temperature. We ensured that

Page 6: Recognition of Nutrition Intake using Time-Frequency ...web.cs.ucla.edu/~bshahbazi/publications/Nabil_necklace.pdfof the necklace, along with a small microcontroller board capable

1530-437X (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/JSEN.2015.2402652, IEEE Sensors Journal

6

Copyright: UCLA Wireless Health Institute

1

Piezoelectric Sensor

Statistical Feature

Extraction

Feature Selection

Classification Algorithm

Labeled Data

Optimal Classifier

Piezoelectric Sensor

Statistical Feature

Extraction

Unknown Data

v1, v2,… vn vi, vi+1,… vj

Optimal Feature Set

Optimal Classifier

Training Stage

Recognition Stage

CFS Subset Evaluation

Swallow Detection

Spectrogram Generation

Swallow Detection

Spectrogram Generation

Fig. 8. System Architecture.

the portion sizes were identical from one subject to another.The subjects were then asked to push a button every time theyswallowed; this helped us further annotate the data in order toprovide truth labels for the dataset.

B. Experiment 2

In the second experiment we increased the number ofsubjects to twenty, eight female and twelve male, ages 20to 40 years. We also added hot tea as another food typeof liquid form. This enabled us to distinguish between hotand cold drinks. The subjects each consumed 8 ounces ofroom temperature water and 8 ounces of hot tea. They alsoconsumed two fun-sized snicker bars, a meat-like veggie patty,and a handful of mixed nuts.

In this experiment we compare the classifiers’ ability todistinguish between liquid and solid as well as differenttextures, temperatures, and consistencies.

VI. RESULTS AND DISCUSSION

The feature selection algorithm that consistently performedbest in combination with the classifiers was the Correlation-based feature subset selection algorithm [18]. This methodevaluates the worth of a subset of attributes by consideringthe individual predictive ability of each feature as well as theredundancy between them.

A. Experiment 1

According to our classification results, using spectrogram-based features on a signal from a piezoelectric sensor candistinguish between liquid and solid swallows with higheraccuracy than Matching Pursuit and Wavelets (See Table II)using the Random Forest Classifier (with n=100 trees), whichyielded the optimal results for all three experiments. Bestresults were achieved using a window size of 32, an FFTlength of 32, and an overlap of 50%. Spectrogram-basedfeatures consistently outperformed other methods and for thisreason we focus on spectrogram-based feature results.

TABLE IITIME-FREQUENCY DECOMPOSITION RESULTS

Precision Recall F-Measure AUCMatching Pursuit 75.9 77.6 76.7% 86.5%

Wavelet 83.9 85.9 84.9% 94.3%Spectrogram 91.2% 91.2% 91.2% 95.5%

TABLE IIICONFUSION MATRIX FOR EXPERIMENT 1 LIQUIDS VS. SOLIDS UNDER

RANDOM FOREST

Predicted OutcomeSwallow Type Liquid Solid Recall

Liquid 75 10 88.2%Solid 5 80 94.1%

Precision 93.8% 88.8%

In this experiment the Liquid class label is represented bywater, and the Solid class label is represented equally by chipsand sandwich. Using spectrogram analysis, the classifiers thatyielded the best results were using Bayesian Networks andRandom Forest Classifier (with number trees set to 100). UsingkNN, Bayesian Networks, and Random Forests we achievedweighted average F-measures (the harmonic mean of precisionand recall) of 86.1%, 83.6%, and 91.2% respectively. Weprovide the Random Forest (n=100 trees) confusion matrix inTable III. The precision for Liquids is 91.7% which is higherthan the precision for Solids, which is 90.1%, but the recallfor Liquids is less than that of Solids.

We further analyze the classification algorithms ability todistinguish between the two solid food types chips and sand-wich. The results still favor the Random Forest Classifier withan F-measure of 75.9%, outperforming kNN and BayesianNetworks by more than 5%. Table IV provides the confusionmatrix for the Random Forest Classifier, the majority of themisclassification occurs between the Chips and Sandwich classlabels. The F-measure is approximately 76.6%. As can beseen from the results, distinguishing between solids is quite

Page 7: Recognition of Nutrition Intake using Time-Frequency ...web.cs.ucla.edu/~bshahbazi/publications/Nabil_necklace.pdfof the necklace, along with a small microcontroller board capable

1530-437X (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/JSEN.2015.2402652, IEEE Sensors Journal

7

TABLE IVCONFUSION MATRIX FOR EXPERIMENT 1 DISTINGUISHING EACH

CATEGORY UNDER THE RANDOM FOREST CLASSIFIER

Predicted Outcome

Swallow Type Water Sandwich Chips Recall Liq/Sldrecall

Water 43 3 4 86.0% 86.0%Sandwich 2 39 9 78.0% 93.0%

Chips 5 12 33 66%Precision 86.0% 72.2% 71.7%

Liq/Sld 86.0% 93.0%precision

TABLE VCONFUSION MATRIX FOR EXPERIMENT 1 DISTINGUISHING EACH

CATEGORY UNDER THE RANDOM FOREST CLASSIFIER

Predicted Outcome

Swallow Type Water Sandwich Chips Recall Liq/Sldrecall

Water 10 0 0 100% 100%Sandwich 1 9 0 90.0% 95.0%

Chips 0 2 8 80%Precision 90.9% 81.8% 100%

Liq/Sld 90.9% 100%precision

challenging, due to the large variation in swallow and chewbehavior across subjects. To estimate the effects of this testrun on distinguishing between solid and liquid food types,Table IV provides the resulting Liquid/Solid precision andrecall values, which result in a 93.0% Solid precision andrecall.

While it may be challenging to distinguish a single swallowof chips and sandwich, we wanted to find out if the classi-fication algorithm could distinguish between solids given afixed period in time. We calculated the spectrogram centeredaround a swallow with a 20 second time interval. The optimalresults were provided using a window size of 128, FFTlength of 64, and an overlap of 50%. The results show greatimprovement over single swallows, as shown in Table V,

TABLE VICONFUSION MATRIX FOR EXPERIMENT 2 LIQUIDS VS. SOLIDS UNDER

RANDOM FOREST CLASSIFIER

Predicted OutcomeSwallow Type Liquid Solid Recall

Liquid 207 33 86.3%Solid 31 209 87.1%

Precision 87.0% 86.4%

TABLE VIICONFUSION MATRIX FOR EXPERIMENT 2 HOT TEA VS.WATER UNDER

RANDOM FOREST CLASSIFIER

Predicted OutcomeSwallow Type Hot Tea Water Recall

Hot Tea 92 8 92.0%Water 12 88 88.0%

Precision 88.5% 91.7%

TABLE VIIICONFUSION MATRIX FOR EXPERIMENT 2 DISTINGUISHING SOLIDS UNDER

RANDOM FOREST CLASSIFIER

Predicted OutcomeSwallow Type Nuts Chocolate Patty Recall

Nuts 36 10 4 72.0%Chocolate 5 41 4 82.0%

Patty 2 5 43 86.0%Precision 83.7% 73.2% 84.3%

where the Liquid and Solid precision is now set to 90.9% and100%, respectively, and the Liquid and Solid recall is 100%,and 95%, respectively.

B. Experiment 2

In the second experiment the classifier that provided the bestresults was also the Random Forest Classifier with numberof trees set to 100 (we tested n=10, n=50 and n=100).The resulting confusion matrix is provided in Table VI. Theresults again show the ability of the algorithm to accuratelydistinguish between liquid and solids.

The Liquid class label precision and recall of 87.0% and86.3%, respectively. The Solid class label produced a highprecision and recall of 86.4% and 87.1%, which further affirmour findings that the collected spectrogram features are gooddiscriminants between liquids and solids. The Random ForestClassifier resulted in a 86.6% F-measure compared to 77%and 78.1% F-measures of Bayesian Networks and kNN (withk=3 yielding the best results), respectively.

When testing the ability of the algorithm to distinguishbetween hot tea and water, our results show that the RandomForest Classifier resulted in a 90% F-measure, with a liquidprecision and recall of 88.5% and 92.0% respectively. Thesolid precision and recall is also high at 91.7% and 88.0%.Table VII provides the confusion matrix between the Hot Teaand Water class label. The Random Forest Classifier resulted ina 90.0% F-measure compared to 69.0% and 84.0% F-measuresof Bayesian Networks and kNN (with k=3 yielding the bestresults), respectively.

It’s interesting to note the challenge of distinguishing be-tween solids. While there are an infinite number of food types,people often maintain a regular regimen. Such a system canbecome customized based on each subjects diet. As seen inTable VIII the Random Forest Classifier consistently outper-forms other well-known classifiers, even when distinguishingbetween solids, achieving an F-measure of 80%. The BayesianNetwork and kNN classifier resulted in a 72.6% and 65.4%F-measure, respectively. Table VIII provides the confusionmatrix for the Random Forest Classifier.

VII. CONCLUSION

In this paper we performed classification of swallows usingstatistical features collected from spectrograms generated frompiezoelectric sensor signals. Our results show promise in usingspectrogram analysis in combination with piezoelectric sensorsas opposed to audio sensors. We have developed and tested a

Page 8: Recognition of Nutrition Intake using Time-Frequency ...web.cs.ucla.edu/~bshahbazi/publications/Nabil_necklace.pdfof the necklace, along with a small microcontroller board capable

1530-437X (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. Seehttp://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI10.1109/JSEN.2015.2402652, IEEE Sensors Journal

8

necklace prototype which has shown the ability to successfullydistinguish between liquids and solids in two experimentsusing Random Forest Classifier with 100 trees resulting inan F-measure above 90%. We show a system and frameworkcapable of distinguishing between hot and cold drinks with anF-measure of 90%. We also show potential for distinguishingbetween solid food types with an F-measure of about 80%.Our future work intends to expand classification to differenttypes of foods, and test in more natural living environments.

REFERENCES

[1] Dietary guidelines for americans. Tech. Rep. 32877, United StatesDepartment of Agriculture, 2010.

[2] Aboofazeli, M., and Moussavi, Z. Automated classification of swal-lowing and breadth sounds. In Engineering in Medicine and BiologySociety, 2004. IEMBS ’04. 26th Annual International Conference of theIEEE, vol. 2 (Sept 2004), 3816–3819.

[3] Alshurafa, N., Eastwood, J., Nyamathi, S., Xu, W., Liu, J. J., andSarrafzadeh, M. Battery optimization in smartphones for remote healthmonitoring systems to enhance user adherence. In Proceedings of the 7thinternational conference on PErvasive Technologies Related to AssistiveEnvironments, ACM (2014), 8.

[4] Alshurafa, N., Eastwood, J.-A., Pourhomayoun, M., Nyamathi, S., Bao,L., Mortazavi, B., and Sarrafzadeh, M. Anti-cheating: Detecting self-inflicted and impersonator cheaters for remote health monitoring systemswith wearable sensors. In Wearable and Implantable Body SensorNetworks (BSN), 2014 11th International Conference on, IEEE (2014),92–97.

[5] Alshurafa, N., Kalantarian, H., Pourhomayoun, M., Sarin, S., Liu, J.,and Sarrafzadeh, M. Non-invasive monitoring of eating behavior usingspectrogram analysis in a wearable necklace. In IEEE EMBS HealthcareInnovations & Point of Care Technologies (HIPT) (2014).

[6] Alshurafa, N., Xu, W., Liu, J. J., Huang, M.-C., Mortazavi, B., Roberts,C. K., and Sarrafzadeh, M. Designing a robust activity recognitionframework for health and exergaming using wearable sensors. Biomed-ical and Health Informatics, IEEE Journal of 18, 5 (2014), 1636–1646.

[7] Alshurafa, N., Xu, W., Liu, J. J., Huang, M.-C., Mortazavi, B., Sar-rafzadeh, M., and Roberts, C. Robust human intensity-varying activityrecognition using stochastic approximation in wearable sensors. In BodySensor Networks (BSN), 2013 IEEE International Conference on, IEEE(2013), 1–6.

[8] Amft, O. Methods for detection and classification of normal swallowingfrom muscle activation and sound. In Proceedings of the First Interna-tional Conference on Pervasive Computing Technologies for Healthcare,ICST, Proceedings of the First International Conference on Pervasive C(0 2006).

[9] Amft, O., Junker, H., and Troster, G. Detection of eating and drinkingarm gestures using inertial body-worn sensors. In Wearable Computers,2005. Proceedings. Ninth IEEE International Symposium on (Oct 2005),160–163.

[10] Dong, Y., Hoover, A., and Muth, E. A device for detecting and countingbites of food taken by a person during eating. In Bioinformatics andBiomedicine, 2009. BIBM ’09. IEEE International Conference on (Nov2009), 265–268.

[11] D’Ottaviano, F. G., Linhares Filho, T. A., Andrade, H. M. T. d., Alves,P. C. L., and Rocha, M. S. G. Fiberoptic endoscopy evaluation ofswallowing in patients with amyotrophic lateral sclerosis. BrazilianJournal of Otorhinolaryngology 79 (06 2013), 349 – 353.

[12] Dugan, P., Pourhomayoun, M., Shiu, Y., Paradis, R., Rice, A., andClark, C. Using high performance computing to explore large complexbioacoustic soundscapes: Case study for right whale acoustics. ProcediaComputer Science 20, 0 (2013), 156 – 162. Complex Adaptive Systems.

[13] Ertekin, C., Yceyar, N., and Aydodu, I. Clinical and electrophysiologicalevaluation of dysphagia in myasthenia gravis. Journal of Neurology,Neurosurgery and Psychiatry 65, 6 (1998), 848–856.

[14] Flanagan, J. Speech analysis synthesis and perception. Kommunikationund Kybernetik in Einzeldarstellungen. Springer-Verlag, 1972.

[15] Fraternali, F., Rofouei, M., Alshurafa, N., Ghasemzadeh, H., Benini, L.,and Sarrafzadeh, M. Opportunistic hierarchical classification for poweroptimization in wearable movement monitoring systems. In IndustrialEmbedded Systems (SIES), 2012 7th IEEE International Symposium on,IEEE (2012), 102–111.

[16] Fund, W. C. R. Policy and action for cancer prevention. Food, nutritionand physical activity: a global perspective. Tech. rep., 2007.

[17] Guyon, I., and Elisseeff, A. An introduction to variable and featureselection. J. Mach. Learn. Res. 3 (Mar. 2003), 1157–1182.

[18] Hall, M. A. Correlation-based Feature Subset Selection for MachineLearning. PhD thesis, University of Waikato, Hamilton, New Zealand,1998.

[19] Jain, M., Howe, G. R., and Rohan, T. Dietary assessment in epidemiol-ogy: Comparison of a food frequency and a diet history questionnairewith a 7-day food record. American Journal of Epidemiology 143, 9(1996), 953–960.

[20] Kalantarian, H., Alshurafa, N., Le, T., and Sarrafzadeh, M. Monitoringeating habits using a piezoelectric sensor-based necklace. ElsevierComputers in Biology and Medicine 58, C (2015), 46–55.

[21] Kalantarian, H., Alshurafa, N., Le, T., and Sarrafzadeh, M. Non-invasivedetection of medication adherence using a digital smart necklace. InIEEE PerCom: Smart Environments Workshop (2015).

[22] Kalantarian, H., Alshurafa, N., Pourhomayoun, M., Sarin, S., Le, T.,and Sarrafzadeh, M. Spectrogram-based audio classification of nutritionintake. In IEEE EMBS Healthcare Innovations & Point of CareTechnologies (HIPT) (2014).

[23] Kalantarian, H., Alshurafa, N., Pourhomayoun, M., and Sarrafzadeh, M.Power optimization for wearable devices. In IEEE Percom: WristSense(2015).

[24] Kalantarian, H., Alshurafa, N., and Sarrafzadeh, M. A wearable nutritionmonitoring system. In IEEE Body Sensor Networks (2014).

[25] Klap, T., and Shinar, Z. Using piezoelectric sensor for continuous-contact-free monitoring of heart and respiration rates in real-life hospitalsettings. In Computing in Cardiology Conference (CinC), 2013 (Sept2013), 671–674.

[26] Limdi, A., McCutcheon, M., Taub, E., Whitehead, W., and Cook, E.W.,I. Design of a microcontroller-based device for deglutition detectionand biofeedback. In Engineering in Medicine and Biology Society,1989. Images of the Twenty-First Century., Proceedings of the AnnualInternational Conference of the IEEE Engineering in (Nov 1989), 1393–1394 vol.5.

[27] Makeyev, O., Sazonov, E., Schuckers, S., Lopez-Meyer, P., Baidyk, T.,Melanson, E., and Neuman, M. Recognition of swallowing soundsusing time-frequency decomposition and limited receptive area neuralclassifier. In Applications and Innovations in Intelligent Systems XVI,T. Allen, R. Ellis, and M. Petridis, Eds. Springer London, 2009, 33–46.

[28] Mallat, S. A Wavelet Tour of Signal Processing, Third Edition: TheSparse Way, 3rd ed. Academic Press, 2008.

[29] Mallat, S., and Zhang, Z. Matching pursuits with time-frequencydictionaries. Signal Processing, IEEE Transactions on 41, 12 (Dec1993), 3397–3415.

[30] Mattes, R. D., and Campbell, W. W. Effects of food form and timingof ingestion on appetite and energy intake in lean young adults and inyoung adults with obesity. Journal of the American Dietetic Association109, 3 (2009), 430 – 437.

[31] Mellinger, D. K., and Clark, C. W. Recognizing transient low-frequencywhale sounds by spectrogram correlation. The Journal of the AcousticalSociety of America 107, 6 (2000).

[32] Mortazavi, B. J., Pourhomayoun, M., Alsheikh, G., Alshurafa, N., Lee,S. I., and Sarrafzadeh, M. Determining the single best axis for exerciserepetition recognition and counting on smartwatches. In Wearableand Implantable Body Sensor Networks (BSN), 2014 11th InternationalConference on, IEEE (2014), 33–38.

[33] Nagae, M., and Suzuki, K. A neck mounted interface for sensingthe swallowing activity based on swallowing sound. In Engineeringin Medicine and Biology Society,EMBC, 2011 Annual InternationalConference of the IEEE (Aug 2011), 5224–5227.

[34] Savitzky, A., and Golay, M. J. E. Smoothing and differentiation ofdata by simplified least squares procedures. Analytical Chemistry 36, 8(1964), 1627–1639.

[35] Sazonov, E., and Fontana, J. A sensor system for automatic detectionof food intake through non-invasive monitoring of chewing. SensorsJournal, IEEE 12, 5 (May 2012), 1340–1348.

[36] Sussillo, D., Kundaje, A., and Anastassiou, D. Spectrogram analysis ofgenomes. EURASIP J. Appl. Signal Process. 2004 (Jan. 2004), 29–42.

[37] Yao, N., Sclabassi, R., Liu, Q., and Sun, M. A video-based algorithmfor food intake estimation in the study of obesity. In BioengineeringConference, 2007. NEBC ’07. IEEE 33rd Annual Northeast (March2007), 298–299.