Data and Information Systems Laboratory University of Illinois Urbana-Champaign
ASONAM 2011 July, 2011
Geo-Friends Recommendation in GPS-Based Cyber-Physical Social Network
Xiao Yu, Ang Pan, Lu-An Tang, Zhenhui Li, Jiawei Han
University of Illinois at Urbana-Champaign
Acknowledgements: NSF, ARL, NASA, AFOSR (MURI), IBM & Boeing
Roadmap
• Motivation
• Background and Preliminaries
• Geo-friend Finding Framework
  • GPS Pattern Extraction
  • Build Pattern-Based Information Network
  • Random Walk with Restart on Heterogeneous Information Network
• Experiments
• Conclusions
Motivation: Popularity of Mobile Devices
• Mobile devices: Very popular, a major medium of communication
• Data from mobile devices (like real-time GPS location, moving trajectories): Reflect users’ daily activities and real life social interactions
• Social network services: Allow users to store and share locations and trajectories collected from their mobile devices
A List of Major Location-Based Social Network Services
Foursquare, Facebook Places, Google Latitude, Twitter Location Update, Yelp Check-in, Google+, ……
Motivation: Geo-Friends Recommendation
• A social network with data collected from sensors is usually referred to as a Cyber-Physical Social Network
• Problem to be solved: Friend recommendation in GPS-based cyber-physical social networks, by combining GPS data with social network information
• Our method discovers real life friends on web-based social networks
• Geo-Friends: Potential real life friends, who have both social similarities and geographical correlation
A Geo-Friend Finding Example
• Real life friends play an important role in off-line social events, while most virtual on-line friends cannot fulfill such a social function
Alex needs geo-friends to join him in a local charity event
Bob is a college friend who now lives in another country
Carlos is a co-worker but has no social network similarity with Alex
David shares common friends with Alex and goes to the same gym and the same game store
David is more likely to be Alex’s geo-friend, but we cannot get this information by analyzing only the social network or only the GPS data.
Contribution
• Propose the geo-friend recommendation problem and discuss how it differs from the previously studied link prediction problem
• Define and generate a set of GPS patterns to describe people’s real life social interaction and correlation
• Propose a random walk-based statistical framework for geo-friend recommendation
• Design and conduct a series of experiments on both synthetic and real-world datasets
• Demonstrate the power of our method in various situations
Data Model
• GPS Trajectory: The sequence formed by connecting the GPS records of a particular user in ascending order of timestamps
• GPS-Based Cyber-Physical Social Network: G(S, V, E)
  • V: Set of people in the network
  • E: Set of edges, representing all the links between people
  • S: Set of GPS trajectories associated with people
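The data model above can be sketched as a small container type. This is only an illustration: the class and field names (`GPSRecord`, `CyberPhysicalNetwork`, `add_records`, and so on) are hypothetical, and treating friendship edges as undirected is an assumption the slides do not state.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class GPSRecord:
    timestamp: float  # assumed unit: seconds since some epoch
    lat: float
    lon: float


@dataclass
class CyberPhysicalNetwork:
    people: set = field(default_factory=set)          # V
    edges: set = field(default_factory=set)           # E, stored as frozenset pairs
    trajectories: dict = field(default_factory=dict)  # S, person -> list of GPSRecord

    def add_friendship(self, u, v):
        # Assumption: links between people are undirected.
        self.people.update((u, v))
        self.edges.add(frozenset((u, v)))

    def add_records(self, person, records):
        # A trajectory is the person's records in ascending timestamp order.
        self.people.add(person)
        merged = self.trajectories.get(person, []) + list(records)
        self.trajectories[person] = sorted(merged, key=lambda r: r.timestamp)

    def are_linked(self, u, v):
        return frozenset((u, v)) in self.edges
```

Records may be added in any order; `add_records` re-sorts them so each stored trajectory follows ascending timestamps.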
Problem Definition
• Given G(S, V, E), and a particular query posed by person v∗
• Return a ranked list of people nodes in V such that each element v′ in the list satisfies <v∗, v′> ∉ E
• In addition, the ranking score should depend on both the GPS trajectory set S and the social network (V, E)
Geo-Friends Finding Framework: 3 Steps
• GPS pattern extraction
  • Convert raw, noisy GPS data to meaningful and representative GPS patterns
• Pattern-based heterogeneous information network building
  • Combine geographical and social information together in one network
• Random walk with restart on the network
  • Use random walk score to measure similarity between people vertices
GPS Pattern Extraction
• Based on empirical observations and heuristics, we propose four different GPS patterns to capture this information
• First, convert the raw GPS trajectory dataset S to a categorical dataset S_cat and a sequential dataset S_seq
• S_cat: Discard temporal information and keep discretized locations in an unordered manner
• S_seq: Locations are sequentially connected in order of timestamps
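The conversion step can be sketched as follows. The slides do not say how locations are discretized, so the uniform latitude/longitude grid, the `cell_deg` parameter, and the collapsing of consecutive repeated cells are all assumptions made for illustration.

```python
import math


def to_cell(lat, lon, cell_deg=0.01):
    """Map a coordinate to a grid-cell id (hypothetical discretization)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def build_datasets(trajectories, cell_deg=0.01):
    """trajectories: person -> list of (timestamp, lat, lon) tuples.

    Returns (s_cat, s_seq): an unordered set of cells per person and a
    timestamp-ordered list of cells per person.
    """
    s_cat, s_seq = {}, {}
    for person, records in trajectories.items():
        ordered = sorted(records)  # tuples sort by timestamp first
        cells = [to_cell(lat, lon, cell_deg) for _, lat, lon in ordered]
        s_cat[person] = set(cells)  # temporal information discarded
        seq = []
        for cell in cells:
            # Assumption: staying in one cell counts as a single visit.
            if not seq or seq[-1] != cell:
                seq.append(cell)
        s_seq[person] = seq
    return s_cat, s_seq
```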
FL-Pattern
• FL-Pattern: Closed frequent patterns with support ≥ 2 in S_cat are defined as Frequent Location Patterns
• Frequent patterns in S_cat can be generated using FP-Growth
• Heuristic: GPS locations can reflect people’s interests, and people tend to go to their interest-related locations more often
• If two people share common locations, which suggests they might share common interests, the probability that they become friends is higher.
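To make the definition concrete, here is a minimal sketch that finds closed frequent location sets by brute-force enumeration. It is not FP-Growth, which the slides name as the mining algorithm; exhaustive enumeration is swapped in only because it is short and easy to check on toy data. The function name and input layout are hypothetical.

```python
from itertools import combinations


def closed_frequent_location_sets(s_cat, min_support=2):
    """s_cat: person -> set of location ids.

    Returns {frozenset(locations): frozenset(people who visited all of them)}
    restricted to closed sets with support >= min_support.
    """
    locations = sorted(set().union(*s_cat.values()))
    frequent = {}
    for size in range(1, len(locations) + 1):
        for combo in combinations(locations, size):
            items = frozenset(combo)
            supporters = frozenset(p for p, locs in s_cat.items() if items <= locs)
            if len(supporters) >= min_support:
                frequent[items] = supporters
    # A set is closed if no strict superset has the same support.
    return {
        items: people
        for items, people in frequent.items()
        if not any(
            items < other and len(frequent[other]) == len(people)
            for other in frequent
        )
    }
```

For example, if Alex visits {gym, store, cafe}, David visits {gym, store}, and Bob visits {cafe}, the closed patterns are {gym, store} (Alex, David) and {cafe} (Alex, Bob); {gym} alone is dropped because {gym, store} has the same support.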
FT-Pattern
• FT-Pattern: A closed sequential pattern with support ≥ 2 and length ≥ 2 in S_seq is a Frequent Trajectory Pattern
• Sequential patterns in S_seq can be generated using PrefixSpan
• Heuristic: GPS trajectory segments indicate people’s habits and routines
• People who share similar routines tend to become friends
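The definition can be illustrated with a brute-force sketch that enumerates subsequences instead of running PrefixSpan (the algorithm the slides name). The `max_len` cap is a hypothetical parameter that keeps enumeration tractable, so closedness here is only checked against patterns up to that length.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def closed_sequential_patterns(s_seq, min_support=2, min_len=2, max_len=4):
    """s_seq: person -> ordered list of location ids.

    Returns {pattern tuple: frozenset(people whose sequence contains it)}.
    """
    candidates = set()
    for seq in s_seq.values():
        for n in range(min_len, min(max_len, len(seq)) + 1):
            for idx in combinations(range(len(seq)), n):
                candidates.add(tuple(seq[i] for i in idx))
    support = {}
    for pat in candidates:
        people = frozenset(p for p, seq in s_seq.items() if is_subsequence(pat, seq))
        if len(people) >= min_support:
            support[pat] = people
    # Closed: no longer frequent pattern contains this one with the same supporters.
    return {
        pat: people
        for pat, people in support.items()
        if not any(
            len(other) > len(pat)
            and is_subsequence(pat, other)
            and support[other] == people
            for other in support
        )
    }
```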
FLT-Pattern
• FLT-Pattern: For each FL-Pattern, if its locations share the same timestamps in all corresponding GPS trajectories, and no super-pattern with the same support can be generated by adding another time-constrained location, the pattern is a Frequent Location with Time Constraint Pattern
• Heuristic: If two people share the same locations at the same timestamps in their GPS trajectories, they should be geographically related.
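The heuristic can be checked on toy data with a simplified sketch: it only reports which pairs of people were at the same location in the same time slot, and does not perform the full closed-pattern mining that defines an FLT-Pattern. Representing a timed visit as a `(location, time_slot)` pair is an assumption.

```python
from itertools import combinations


def shared_timed_locations(timed_visits):
    """timed_visits: person -> set of (location, time_slot) pairs.

    Returns {(p, q): set of timed locations both people share}.
    """
    shared = {}
    for p, q in combinations(sorted(timed_visits), 2):
        common = timed_visits[p] & timed_visits[q]
        if common:
            shared[(p, q)] = common
    return shared
```

Two people who visit the same gym at different hours share an FL-Pattern but produce no entry here, which is the distinction the time constraint is meant to capture.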
FTT-Pattern
• FTT-Pattern: Similarly to the FLT-Pattern, a Frequent Trajectory with Time Constraint Pattern is a closed sequential pattern with support ≥ 2 and length ≥ 2 in S_seq that occupies the same time period in all corresponding GPS trajectories
• Heuristic: If two people share the same routine in a specific time period, they are likely hanging out during that period
• If two people hang out, the probability that they become geo-friends is higher
Pattern-Based Social Network
• Build a pattern-based heterogeneous information network by combining GPS patterns and social network structures
• Given G(S, V, E), first discard raw GPS trajectory set S
• Then for each GPS pattern, create an additional node p, and link corresponding person node v with p if this GPS pattern exists in person v’s GPS trajectory history
Pattern-Based Social Network (2)
• Create a new edge <v, p>, and add it to E′. The set E′ contains three types of edges: edges between people, edges from person nodes to pattern nodes, and edges from pattern nodes to person nodes.
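The construction can be sketched as below. Node tagging with `("person", id)` and `("pattern", id)` tuples, the function name, and the treatment of friendship links as undirected are illustrative assumptions; edge weights are left out.

```python
def build_pattern_network(people, friend_edges, patterns):
    """Build the node and edge sets of the pattern-based network.

    people: iterable of person ids (V)
    friend_edges: iterable of (u, v) pairs (E), assumed undirected
    patterns: dict pattern_id -> set of people whose history contains it
    Returns (nodes, edges); edges is a set of directed (src, dst) pairs (E').
    """
    nodes = {("person", v) for v in people}
    edges = set()
    for u, v in friend_edges:
        edges.add((("person", u), ("person", v)))
        edges.add((("person", v), ("person", u)))
    for pattern_id, owners in patterns.items():
        p = ("pattern", pattern_id)
        nodes.add(p)
        for v in owners:
            edges.add((("person", v), p))  # person -> pattern
            edges.add((p, ("person", v)))  # pattern -> person
    return nodes, edges
```

Two people with no friendship edge, such as Alex and David in the earlier example, become reachable from each other in two hops through a shared pattern node.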
Pattern Refinement
• Adding a large number of GPS patterns without selection may severely degrade performance
• Common locations, e.g., bus stops and hospitals, carry no social similarity
• Instead of manually refining patterns, we employ an entropy-based thresholding measure* to refine and select discriminative GPS patterns
• This method filters out patterns with high frequency and low length
* J.N. Kapur, P.K. Sahoo and A.K.C. Wong. A new method for gray-level picture thresholding using the entropy of the histogram. In Computer Vision, Graphics, and Image Processing, March 1985.
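For reference, the thresholding criterion from the cited Kapur et al. paper can be sketched as follows: pick the histogram split that maximizes the sum of the entropies of the two resulting class distributions. How the histogram over GPS patterns is built (for example, over pattern support or length) is not specified in these slides, so the input here is simply a hypothetical list of bin counts.

```python
import math


def kapur_threshold(histogram):
    """Return the bin index t maximizing entropy(bins 0..t) + entropy(bins t+1..).

    histogram: list of non-negative bin counts. Returns None if no valid split exists.
    """
    total = sum(histogram)
    probs = [h / total for h in histogram]
    best_t, best_h = None, -math.inf
    for t in range(len(probs) - 1):
        low, high = probs[: t + 1], probs[t + 1:]
        p_low, p_high = sum(low), sum(high)
        if p_low <= 0.0 or p_high <= 0.0:
            continue  # one side is empty, so this is not a real split
        h_low = -sum((p / p_low) * math.log(p / p_low) for p in low if p > 0)
        h_high = -sum((p / p_high) * math.log(p / p_high) for p in high if p > 0)
        if h_low + h_high > best_h:
            best_t, best_h = t, h_low + h_high
    return best_t
```

Patterns falling on the "common" side of the chosen threshold would then be discarded; which side that is depends on the histogram definition, which is left open here.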
Edge Weights: between Pattern Nodes and Person Nodes
• After the construction of the heterogeneous information network, edge weights between nodes need to be defined
• From person nodes to the different types of GPS pattern nodes
Nbp(v) is the set of pattern nodes connected to person node v
length(p) denotes the length of pattern p
timespan(p) denotes the time span of a time-constrained pattern p
Parameters α, β, γ and θ control pattern importance
Edge Weights (2)
• From pattern nodes to person nodes
• Nbv(p) denotes the set of person nodes connected to pattern node p
• From person nodes to person nodes
• Nbv(v) denotes the set of person nodes connected to person node v
Transition Matrix
• In order to apply random walk with restart on the network, we need to convert the network into a transition matrix and then normalize the edge weights of pattern nodes
• Pr(V) is a |V| × |V| matrix representing the transition probability from person nodes to person nodes
• Pr(A) is a |P| × |V| matrix representing the transition probability from GPS pattern nodes to person nodes
• Pr(B) is a |V| × |P| matrix representing the transition probability from person nodes to GPS pattern nodes
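One way to combine the three blocks into a single matrix is sketched below. Ordering nodes as people first and patterns second, leaving the pattern-to-pattern block at zero, and normalizing every row to sum to 1 are assumptions; the slides do not show the exact layout or normalization.

```python
def row_normalize(matrix):
    """Scale each row to sum to 1; all-zero rows are left as zeros."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([x / total for x in row] if total > 0 else [0.0] * len(row))
    return out


def assemble_transition(w_vv, w_vp, w_pv):
    """Stack block weights into one (|V|+|P|) x (|V|+|P|) transition matrix.

    w_vv: |V| x |V| person -> person weights   (Pr(V) block)
    w_vp: |V| x |P| person -> pattern weights  (Pr(B) block)
    w_pv: |P| x |V| pattern -> person weights  (Pr(A) block)
    """
    n_patterns = len(w_pv)
    top = [list(vv) + list(vp) for vv, vp in zip(w_vv, w_vp)]
    bottom = [list(pv) + [0.0] * n_patterns for pv in w_pv]
    return row_normalize(top + bottom)
```

With two people who are friends and one pattern held by the first person only on the outgoing side, `assemble_transition([[0, 1], [1, 0]], [[1], [0]], [[1, 1]])` gives rows `[0, 0.5, 0.5]`, `[1, 0, 0]` and `[0.5, 0.5, 0]`.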
Why Choose Random Walk with Restart
• Random Walk with Restart can simulate the following aspects of friend finding in a GPS-based social network
  • If a GPS pattern contains more geographical information, the in-coming probability from person nodes to this pattern should be higher, which increases the probability of moving from one person to another via this GPS pattern
  • If two people share more GPS patterns, the overall probability for one person to link to another via these GPS pattern nodes is higher
  • If a GPS pattern is rare, the out-going probability of this node is larger, so that people connected to this pattern have a higher probability of being linked together
Random Walk with Restart
• Denote the query person as v∗. The random walk process can be represented as:
• R_N is a vector that represents the link relevance from all the nodes to query person v∗
• R_N^(t) represents the link relevance of each node at the t-th iteration
• We assign R_N^(0)(v∗) = 1, where v∗ is the query node, and set all the other elements to 0
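The update equation is not reproduced in this transcript, so the sketch below uses the standard random walk with restart formulation as an assumption: each iteration sets R^(t+1) = (1 - c) · Pᵀ R^(t) + c · e, where P is the row-normalized transition matrix, e is the indicator vector of the query node, and c is the restart probability. The value c = 0.15, the convergence tolerance, and the helper names are hypothetical.

```python
def random_walk_with_restart(transition, query, restart=0.15, max_iters=200, tol=1e-10):
    """Iterate r <- (1 - restart) * P^T r + restart * e until convergence.

    transition: square row-stochastic matrix as a list of lists
    query: index of the query node v*
    """
    n = len(transition)
    e = [0.0] * n
    e[query] = 1.0
    r = e[:]  # R^(0): 1 at the query node, 0 elsewhere
    for _ in range(max_iters):
        nxt = [restart * e[j] for j in range(n)]
        for i in range(n):
            if r[i] == 0.0:
                continue
            for j in range(n):
                nxt[j] += (1.0 - restart) * r[i] * transition[i][j]
        converged = max(abs(a - b) for a, b in zip(nxt, r)) < tol
        r = nxt
        if converged:
            break
    return r


def recommend(transition, query, n_people, linked, **kwargs):
    """Rank people who are not the query and not already linked to it."""
    scores = random_walk_with_restart(transition, query, **kwargs)
    candidates = [v for v in range(n_people) if v != query and v not in linked]
    return sorted(candidates, key=lambda v: scores[v], reverse=True)
```

`recommend` assumes person nodes occupy indices `0..n_people-1` and pattern nodes follow, and it applies the problem definition's constraint that recommended people must not already be linked to the query person.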
Datasets
• We generate 4 synthetic datasets with different sizes, attributes and distributions in order to cover different scenarios and thoroughly test our framework
• We also apply our method to the MIT Reality Mining dataset
Competitor Methods
• Random: random selection
• Same Edge: choose friends based on the number of common friends
• GPS Similarity: choose friends by measuring GPS location and trajectory similarity
• Random Walk without GPS Patterns: Recommend friends by applying random walk with restart on the original social network
• Bluetooth (only MIT dataset): Recommend friends by returning people who share high meeting frequency
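As an illustration of the simplest structural baseline in the list above, a common-friends ranking might look like the sketch below. The function name, the adjacency-dict input, and the alphabetical tie-breaking are assumptions; the slides only state that candidates are chosen by the number of friends they share with the query person.

```python
def common_friend_ranking(friends, query):
    """Rank non-friends of `query` by how many friends they share with it.

    friends: person -> set of that person's friends (assumed symmetric)
    """
    mine = friends.get(query, set())
    scores = {}
    for person, theirs in friends.items():
        if person == query or person in mine:
            continue  # skip the query itself and existing friends
        scores[person] = len(mine & theirs)
    # Highest count first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))
```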
Performance (1)
[Figure: precision and recall on the gpsnet120 dataset]
Performance (2)
[Figure: precision and recall on the MIT dataset]
Performance (3)
[Figure: precision-recall curves for the gpsnet120 dataset and the MIT dataset]
Precision-recall curves comparing Random Walk with Restart without GPS information against our method
Please refer to the paper for more experimental results and analysis
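For readers unfamiliar with the metrics, precision and recall of a ranked recommendation list can be computed as in this sketch. The evaluation protocol (for example, which friendships are held out as ground truth) is not described in the slides, so the `relevant` set and the cutoff `k` are hypothetical inputs.

```python
def precision_recall_at_k(ranked, relevant, k):
    """Precision and recall of the top-k entries of a ranked list.

    ranked: recommended people, best first
    relevant: set of true (held-out) friends of the query person
    """
    top = ranked[:k]
    hits = sum(1 for person in top if person in relevant)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Sweeping `k` from 1 to the length of the list yields the points of a precision-recall curve.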
Conclusions
• Propose the problem of identifying geographically related friends, along with a three-step statistical framework that combines geo-information with social analysis
• Future work
  • Domain-oriented GPS pattern definition
  • Friend recommendation based on the user and his/her interests
  • Real-time friend recommendation by tracking user GPS usage on the fly