![Page 1: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/1.jpg)
An Introduction to Network Science and Network Data Management
Ruoming Jin
Department of Computer Science
Kent State University
![Page 2: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/2.jpg)
![Page 3: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/3.jpg)
Ubiquitous Networks
• Complex networks are large networks where local behavior generates non-trivial global features.
Social Networks
http://belanger.wordpress.com/2007/06/28/the-ebb-and-flow-of-social-networking/
![Page 4: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/4.jpg)
Complex Network (small world)
Stanley Milgram (1933-1984): “The man who shocked the world”
![Page 5: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/5.jpg)
Complex Networks in Finance
• Financial Markets
![Page 6: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/6.jpg)
![Page 7: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/7.jpg)
More Networks
![Page 8: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/8.jpg)
Cellular systems and biological networks
• Cellular systems are highly dynamic and responsive to environmental cues
• Biological networks
  – Regulatory networks
  – Metabolic networks
  – Protein-protein interaction networks
• Existing studies focus on the topological properties of biological networks
  – In parallel with advances in the study of complex networks
![Page 9: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/9.jpg)
Emergence
• An aggregate system is not equivalent to the sum of its parts. People’s actions can contribute to ends which are no part of their intentions. (Smith)*
• Local rules can produce emergent global behavior. For example: the global match between supply and demand.
• There is emergent behavior in systems that escapes local explanation. More is different. (Anderson)**
* Adam Smith, “The Wealth of Nations” (1776)
** Philip Anderson, “More is Different”, Science 177:393–396 (1972)
![Page 10: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/10.jpg)
Complex Networks (Power-law)
Newman, SIAM’03
![Page 11: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/11.jpg)
Complex Networks – Clustering
• Network Clustering
  – Clustering coefficients: how well connected?
  – What does a complex network look like when you can really see it?
  – Community discovery: separate into densely connected subsets
    • Automatic discovery of communities
    • Split by interest or meaning
![Page 12: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/12.jpg)
Clustering (Transitivity) coefficient
• Measures the density of triangles (local clusters) in the graph
• Two different ways to measure it:
• The ratio of the means
$$C^{(1)} = \frac{\sum_i \text{triangles centered at node } i}{\sum_i \text{triples centered at node } i}$$
![Page 13: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/13.jpg)
Example
[Figure: example graph on nodes 1–5]
$$C^{(1)} = \frac{3}{1 + 1 + 6} = \frac{3}{8}$$
![Page 14: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/14.jpg)
Clustering (Transitivity) coefficient
• Clustering coefficient for node i
$$C_i = \frac{\text{triangles centered at node } i}{\text{triples centered at node } i}$$
• The mean of the ratios
$$C^{(2)} = \frac{1}{n} \sum_i C_i$$
![Page 15: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/15.jpg)
Example
• The two clustering coefficients give different measures
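• $C^{(2)}$ weights nodes with low degree more heavily
[Figure: example graph on nodes 1–5]
$$C^{(2)} = \frac{1}{5}\left(1 + 1 + \frac{1}{6}\right) = \frac{13}{30}$$
$$C^{(1)} = \frac{3}{8}$$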
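The two definitions can be compared with a short Python sketch. The graph below is an assumption, since the slide figure is not in the transcript: a triangle on nodes 1, 2, 3 with two extra nodes 4 and 5 attached to node 3. It is one five-node graph that yields the values 3/8 and 13/30. Nodes with fewer than two neighbors have no triples and are counted here as $C_i = 0$.

```python
from itertools import combinations

def clustering_coefficients(adj):
    """Return (C1, C2) for an undirected graph given as {node: set(neighbors)}."""
    triangles = {}  # number of triangles through each node
    triples = {}    # number of connected triples centered at each node
    for v, nbrs in adj.items():
        k = len(nbrs)
        triples[v] = k * (k - 1) // 2
        # A pair of neighbors that are themselves linked closes a triangle at v.
        triangles[v] = sum(1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a])
    # C1: ratio of the means (total triangles over total triples).
    c1 = sum(triangles.values()) / sum(triples.values())
    # C2: mean of the per-node ratios; nodes without triples contribute 0.
    local = [triangles[v] / triples[v] if triples[v] else 0.0 for v in adj]
    c2 = sum(local) / len(local)
    return c1, c2

# Assumed example graph: triangle 1-2-3 plus nodes 4 and 5 attached to node 3.
example = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4, 5}, 4: {3}, 5: {3}}
c1, c2 = clustering_coefficients(example)
print(c1, c2)
```

Because $C^{(2)}$ averages per-node ratios, the two degree-2 nodes with $C_i = 1$ pull it above $C^{(1)}$, where the high-degree node dominates the triple count.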
![Page 16: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/16.jpg)
CS583, Bing Liu, UIC 16
Centrality
• Important or prominent actors are those that are linked or involved with other actors extensively.
• A person with extensive contacts (links) or communications with many other people in the organization is considered more important than a person with relatively fewer contacts.
• The links can also be called ties. A central actor is one involved in many ties.
![Page 17: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/17.jpg)
Degree Centrality
![Page 18: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/18.jpg)
Closeness Centrality
![Page 19: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/19.jpg)
Betweenness Centrality
• If two non-adjacent actors j and k want to interact and actor i is on the path between j and k, then i may have some control over the interactions between j and k.
• Betweenness measures this control of i over other pairs of actors. Thus,
  – if i is on the paths of many such interactions, then i is an important actor.
![Page 20: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/20.jpg)
Betweenness Centrality (cont …)
• Undirected graph: Let $p_{jk}$ be the number of shortest paths between actor j and actor k.
• The betweenness of an actor i is defined as the number of shortest paths between j and k that pass through i, $p_{jk}(i)$, normalized by the total number of shortest paths $p_{jk}$ and summed over all pairs j < k:
$$\sum_{j<k} \frac{p_{jk}(i)}{p_{jk}} \qquad (4)$$
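As a rough illustration of equation (4), the following Python sketch computes the betweenness of one actor in an undirected, unweighted graph. The function names and the dictionary-of-sets graph representation are assumptions made for this example. It counts shortest paths with one breadth-first search per node and uses the fact that i lies on a shortest j–k path exactly when d(j, i) + d(i, k) = d(j, k).

```python
from collections import deque

def bfs_counts(adj, s):
    """Return (dist, sigma): BFS distances from s and shortest-path counts from s."""
    dist = {s: 0}
    sigma = {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]  # every shortest path to u extends to v
    return dist, sigma

def betweenness(adj, i):
    """Sum over pairs j < k (both different from i) of p_jk(i) / p_jk."""
    info = {s: bfs_counts(adj, s) for s in adj}
    dist_i, sigma_i = info[i]
    others = sorted(n for n in adj if n != i)
    total = 0.0
    for idx, j in enumerate(others):
        dist_j, sigma_j = info[j]
        if i not in dist_j:
            continue  # j cannot reach i, so no j-k path passes through i
        for k in others[idx + 1:]:
            if k not in dist_j:
                continue  # j and k are disconnected
            if dist_j[i] + dist_i[k] == dist_j[k]:
                total += sigma_j[i] * sigma_i[k] / sigma_j[k]
    return total

# Hypothetical 4-cycle 1-2-4-3-1: two shortest paths join 1 and 4, one through 2.
square = {1: {2, 3}, 2: {1, 4}, 3: {1, 4}, 4: {2, 3}}
print(betweenness(square, 2))
```

This brute-force version is meant for small graphs only; it stores a BFS result per node rather than using a more economical accumulation scheme.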
![Page 21: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/21.jpg)
Betweenness Centrality (cont …)
![Page 22: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/22.jpg)
Prestige
• Prestige is a more refined measure of prominence of an actor than centrality.
  – Distinguish: ties sent (out-links) and ties received (in-links).
• A prestigious actor is one who is the object of extensive ties as a recipient.
  – To compute the prestige, we use only in-links.
• Difference between centrality and prestige:
  – centrality focuses on out-links
  – prestige focuses on in-links.
• We study three prestige measures. Rank prestige forms the basis of most Web page link analysis algorithms, including PageRank and HITS.
![Page 23: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/23.jpg)
Degree prestige
![Page 24: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/24.jpg)
Proximity prestige
• The degree index of prestige of an actor i only considers the actors that are adjacent to i.
• The proximity prestige generalizes it by considering both the actors directly and indirectly linked to actor i.
  – We consider every actor j that can reach i.
• Let $I_i$ be the set of actors that can reach actor i.
• The proximity is defined as the closeness or distance of other actors to i.
• Let d(j, i) denote the distance from actor j to actor i.
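The quantities defined above (the set of actors that can reach i, and the distances d(j, i)) can be computed with a breadth-first search over reversed edges. The final ratio in this Python sketch is an assumption: the slide with the formula is not part of this transcript, so the code uses one commonly cited definition, the fraction of actors that can reach i divided by their average distance to i. The function name and graph representation are also illustrative.

```python
from collections import deque

def proximity_prestige(out_links, i):
    """out_links: {node: set of nodes it points to}; every node must appear as a key."""
    # Reverse the edges so a BFS from i visits every actor j that can reach i.
    in_links = {v: set() for v in out_links}
    for u, targets in out_links.items():
        for v in targets:
            in_links[v].add(u)
    dist = {i: 0}
    queue = deque([i])
    while queue:
        v = queue.popleft()
        for u in in_links[v]:
            if u not in dist:
                dist[u] = dist[v] + 1  # this is d(u, i)
                queue.append(u)
    influence = {j: d for j, d in dist.items() if j != i}  # I_i with distances
    if not influence:
        return 0.0  # nobody can reach i
    n = len(out_links)
    reach_fraction = len(influence) / (n - 1)
    mean_distance = sum(influence.values()) / len(influence)
    # Assumed definition: fraction of reaching actors over their average distance.
    return reach_fraction / mean_distance

# Hypothetical chain a -> b -> c: both a and b can reach c.
chain = {"a": {"b"}, "b": {"c"}, "c": set()}
print(proximity_prestige(chain, "c"))
```

Under this definition the value is 1 when every other actor links directly to i and 0 when no actor can reach i.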
![Page 25: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/25.jpg)
Proximity prestige (cont …)
![Page 26: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/26.jpg)
Rank prestige
• In the previous two prestige measures, an important factor is not considered:
  – the prominence of the individual actors who do the “voting”.
• In the real world, a person i chosen by an important person is more prestigious than one chosen by a less important person.
  – For example, a vote from a company CEO counts for much more than a vote from a worker.
• If one’s circle of influence is full of prestigious actors, then one’s own prestige is also high.
  – Thus one’s prestige is affected by the ranks or statuses of the involved actors.
![Page 27: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/27.jpg)
Rank prestige (cont …)
• Based on this intuition, the rank prestige PR(i) is defined as a linear combination of links that point to i:
![Page 28: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/28.jpg)
HITS
• HITS stands for Hypertext Induced Topic Search.
• Unlike PageRank, which is a static ranking algorithm, HITS is search query dependent.
• When the user issues a search query,
  – HITS first expands the list of relevant pages returned by a search engine and
  – then produces two rankings of the expanded set of pages, authority ranking and hub ranking.
![Page 29: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/29.jpg)
Authorities and Hubs
Authority: Roughly, an authority is a page with many in-links.
  – The idea is that the page may have good or authoritative content on some topic and
  – thus many people trust it and link to it.
Hub: A hub is a page with many out-links.
  – The page serves as an organizer of the information on a particular topic and
  – points to many good authority pages on the topic.
![Page 30: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/30.jpg)
Examples
![Page 31: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/31.jpg)
The key idea of HITS
• A good hub points to many good authorities, and
• A good authority is pointed to by many good hubs.
• Authorities and hubs have a mutual reinforcement relationship. Fig. 8 shows some densely linked authorities and hubs (a bipartite sub-graph).
![Page 32: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/32.jpg)
The HITS algorithm: Grab pages
• Given a broad search query, q, HITS collects a set of pages as follows:
  – It sends the query q to a search engine.
  – It then collects the t highest ranked pages (t = 200 is used in the HITS paper). This set is called the root set W.
  – It then grows W by including any page pointed to by a page in W and any page that points to a page in W. This gives a larger set S, called the base set.
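The expansion from the root set W to the base set S can be sketched in a few lines of Python. This is a minimal illustration that assumes the link structure is already available as a dictionary. The function name `build_base_set` and the page names are hypothetical, and the search-engine step that produces W is not modeled.

```python
def build_base_set(links, root_set):
    """links: {page: set of pages it links to}; root_set: the pages in W."""
    root = set(root_set)
    base = set(root)
    for page in root:
        base |= links.get(page, set())   # pages pointed to by a page in W
    for page, targets in links.items():
        if targets & root:               # pages that point to a page in W
            base.add(page)
    return base

# Hypothetical link structure: "x" links into W, "y" is linked from W, "z" is unrelated.
links = {"w1": {"y"}, "w2": set(), "x": {"w2"}, "y": set(), "z": {"y"}}
S = build_base_set(links, {"w1", "w2"})
print(sorted(S))
```

Page "z" stays out of S because it neither links to nor is linked from a root page, even though it shares a neighbor with one.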
![Page 33: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/33.jpg)
The link graph G
• HITS works on the pages in S, and assigns every page in S an authority score and a hub score.
• Let the number of pages in S be n.
• We again use G = (V, E) to denote the hyperlink graph of S.
• We use L to denote the adjacency matrix of the graph.
$$L_{ij} = \begin{cases} 1 & \text{if } (i, j) \in E \\ 0 & \text{otherwise} \end{cases}$$
![Page 34: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/34.jpg)
The HITS algorithm
• Let the authority score of the page i be a(i), and the hub score of page i be h(i).
• The mutual reinforcing relationship of the two scores is represented as follows:
$$a(i) = \sum_{(j,i) \in E} h(j) \qquad (31)$$
$$h(i) = \sum_{(i,j) \in E} a(j) \qquad (32)$$
![Page 35: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/35.jpg)
HITS in matrix form
• We use a to denote the column vector with all the authority scores,
$$a = (a(1), a(2), \ldots, a(n))^T,$$
• and use h to denote the column vector with all the hub scores,
$$h = (h(1), h(2), \ldots, h(n))^T.$$
• Then,
$$a = L^T h \qquad (33)$$
$$h = L a \qquad (34)$$
![Page 36: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/36.jpg)
Computation of HITS
• The computation of authority scores and hub scores is the same as the computation of the PageRank scores, using power iteration.
• If we use $a_k$ and $h_k$ to denote the authority and hub vectors at the kth iteration, the iterations for generating the final solutions, obtained by substituting (33) and (34) into each other, are
$$a_k = L^T L \, a_{k-1}, \qquad h_k = L L^T h_{k-1}$$
![Page 37: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/37.jpg)
The algorithm
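The algorithm listing from this slide is not in the transcript. A minimal Python sketch of the power iteration, using the updates a = Lᵀh and h = La, might look like the following. The fixed iteration count, the all-ones starting vectors, and the sum-to-one normalization are assumptions made for illustration, not details confirmed by the slides.

```python
def hits(L, iterations=50):
    """L[i][j] = 1 if page i links to page j, else 0. Returns (authority, hub)."""
    n = len(L)
    a = [1.0] * n
    h = [1.0] * n
    for _ in range(iterations):
        # a = L^T h: the authority of i sums the hub scores of pages linking to i.
        a = [sum(L[j][i] * h[j] for j in range(n)) for i in range(n)]
        # h = L a: the hub score of i sums the authority scores of pages i links to.
        h = [sum(L[i][j] * a[j] for j in range(n)) for i in range(n)]
        # Normalize so each vector sums to 1 (an assumed, common choice).
        sa, sh = sum(a), sum(h)
        if sa:
            a = [x / sa for x in a]
        if sh:
            h = [x / sh for x in h]
    return a, h

# Hypothetical graph: page 0 links to pages 2 and 3, page 1 links to page 2.
L = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
authority, hub = hits(L)
print(authority, hub)
```

In this small graph page 2 ends up as the strongest authority because both hubs point to it, and page 0 is the strongest hub because it points to both authorities.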
![Page 38: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/38.jpg)
Relationships with co-citation and bibliographic coupling
• Recall that co-citation of pages i and j, denoted by $C_{ij}$, is
$$C_{ij} = \sum_{k=1}^{n} L_{ki} L_{kj} = (L^T L)_{ij}$$
  – the authority matrix ($L^T L$) of HITS is the co-citation matrix C
• bibliographic coupling of two pages i and j, denoted by $B_{ij}$, is
$$B_{ij} = \sum_{k=1}^{n} L_{ik} L_{jk} = (L L^T)_{ij}$$
  – the hub matrix ($L L^T$) of HITS is the bibliographic coupling matrix B
![Page 39: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/39.jpg)
Strengths and weaknesses of HITS
• Strength: its ability to rank pages according to the query topic, which may be able to provide more relevant authority and hub pages.
• Weaknesses:
  – It is easily spammed. It is in fact quite easy to influence HITS since adding out-links in one’s own page is so easy.
  – Topic drift. Many pages in the expanded set may not be on topic.
  – Inefficiency at query time: The query time evaluation is slow. Collecting the root set, expanding it and performing eigenvector computation are all expensive operations.
![Page 40: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/40.jpg)
Complex Networks – Network Motif
• Network Motifs [Uri Alon]
– Are there subgraph patterns that appear more frequently than others?
• 13 possible 3-node directed connected graphs
• Do any of these subgraphs hold special meaning for a complex network?
![Page 41: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/41.jpg)
Our Research
• YesIWell (Leveraging Social Network to Spread Health Behavior)
• Backbone Discovery
• Network Simplification
• Role Analysis
• Network Comparison
• Trust in Social Network
• Uncertainty
![Page 42: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/42.jpg)
Obesity, Smoking, Alcohol Consumption, Spreading in Social Network
![Page 43: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/43.jpg)
YesiWell Project (with PeaceHealth Lab., SK Telecom Americas, Univ. Oregon, UNCC)
![Page 44: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/44.jpg)
![Page 45: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/45.jpg)
Network Backbone Discovery
![Page 46: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/46.jpg)
![Page 47: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/47.jpg)
Network Simplification
![Page 48: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/48.jpg)
Ubiquitous Network (Graph) Data
http://belanger.wordpress.com/2007/06/28/the-ebb-and-flow-of-social-networking/
• Social Network
• Biological Network
• Road Network/Map
• WWW
• Semantic Web/Ontologies
• XML/RDF
• …
Semantic Search, Guha et al., WWW’03
![Page 49: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/49.jpg)
A Fundamental Challenge
• Flat Files
  – No Query Support
• RDBMS
  – Edge Representation
  – SQL Recursion Support:
    • Connect-By (Oracle)
    • Common Table Expressions (CTEs) (Microsoft)
    • Temporal Table
• Native Graph Database
  – http://en.wikipedia.org/wiki/Graph_database
  – Storage and Basic Operators
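To make the RDBMS option concrete, here is a small sketch of the edge representation combined with a recursive common table expression. It uses SQLite through Python's `sqlite3` module purely as a convenient stand-in for the systems named above. The table name `edge`, its columns, and the sample edges are hypothetical.

```python
import sqlite3

# Edge representation: one row per directed edge.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edge (src INTEGER, dst INTEGER)")
conn.executemany(
    "INSERT INTO edge VALUES (?, ?)",
    [(1, 2), (2, 3), (3, 4), (4, 2), (5, 1)],  # includes the cycle 2 -> 3 -> 4 -> 2
)

def reachable_from(conn, start):
    """Return the set of vertices reachable from start, including start itself."""
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            SELECT ?
            UNION
            SELECT edge.dst FROM edge JOIN reach ON edge.src = reach.node
        )
        SELECT node FROM reach
        """,
        (start,),
    ).fetchall()
    return {row[0] for row in rows}

print(sorted(reachable_from(conn, 1)))
```

`UNION` (rather than `UNION ALL`) discards vertices that were already found, which is what lets the recursion terminate on the cycle.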
![Page 50: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/50.jpg)
Gray’s Law: Most Important Graph Queries
• Reachability
• Shortest Path Distance
• Reachability/Distance Join
• Diameters
• Common Neighbors
• Labeled Path/Constraint Path
• Subgraph Matching
• Graph Mining
  – Dense subgraph/clique
  – Clustering
  – Frequent subgraph
• Matrix/Spectral Operations
• …
![Page 51: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/51.jpg)
Reachability Query
[Figure: example directed graph on vertices 1–15]
?Query(1,11) Yes
?Query(3,9) No
The problem: Given two vertices u and v in a directed graph G, is there a path from u to v ?
Directed Graph DAG (directed acyclic graph)
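Without any index, a reachability query can be answered by an online search. The Python sketch below uses breadth-first search. The edges of the slide's example graph are not recoverable from the transcript, so the small graph here is hypothetical.

```python
from collections import deque

def reachable(graph, u, v):
    """Return True if there is a directed path from u to v (u reaches itself)."""
    seen = {u}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return True
        for y in graph.get(x, ()):
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return False

# Hypothetical directed graph as {vertex: list of successors}.
g = {1: [2, 3], 2: [4], 3: [4], 4: [5], 6: [1]}
print(reachable(g, 1, 5), reachable(g, 5, 1))
```

Each query may touch every vertex and edge once, which is the O(n+m) online cost that indexing schemes try to avoid.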
![Page 52: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/52.jpg)
Reachability Applications
• XML/RDF• Biological networks• Ontology• WWW• Social Network• Logical programming (Lattice operation)• Object programming (Class relationship)• Distributed Systems (Reachable states)
![Page 53: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/53.jpg)
Reachability Index Tradeoff
• Query Time
• Index Size
• Construction Cost
• Two Basic Approaches
  – Online DFS/BFS
    • $O(n+m)$, $O(n+m)$, $O(n+m)$
    • Best online search is still at least one order of magnitude slower than the indexing methods!
  – Fully Materialized Transitive Closure
    • $O(1)$, $O(n^2)$, $O(nm)$/$O(n^3)$
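The other extreme of the tradeoff, a fully materialized transitive closure, can be sketched as one graph search per vertex with the results stored for constant-time lookup. The function names and the example graph below are illustrative assumptions.

```python
def transitive_closure(graph):
    """Return {u: set of vertices reachable from u by at least one edge}."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    closure = {}
    for s in nodes:                 # one DFS per vertex: O(n(n+m)) overall
        seen = set()
        stack = [s]
        while stack:
            x = stack.pop()
            for y in graph.get(x, ()):
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        closure[s] = seen           # up to n entries per vertex: O(n^2) space
    return closure

def query(closure, u, v):
    """Expected O(1) reachability lookup against the materialized closure."""
    return u == v or v in closure.get(u, set())

# Hypothetical directed graph.
g = {1: [2, 3], 2: [4], 3: [4], 4: [5], 6: [1]}
tc = transitive_closure(g)
print(query(tc, 1, 5), query(tc, 5, 1))
```

The lookup is a single set membership test, but the stored sets can grow quadratically, which is why compressed indexes are of interest.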
![Page 54: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/54.jpg)
Existing Work

| Method | Query time | Construction | Index size |
|---|---|---|---|
| Optimal Chain Cover (Jagadish, TODS’90) | $O(k)$ | $O(nm)$ | $O(nk)$ |
| Optimal Tree Cover (Agrawal et al., SIGMOD’89) | $O(n)$ | $O(nm)$ | $O(n^2)$ |
| Dual-Labeling (Wang et al., ICDE’06) | $O(1)$ | $O(n+m+t^3)$ | $O(n+t^2)$ |
| Labeling+SSPI (Chen et al., VLDB’05) | $O(m-n)$ | $O(n+m)$ | $O(n+m)$ |
| GRIPP (Trißl et al., SIGMOD’07) | $O(m-n)$ | $O(n+m)$ | $O(n+m)$ |
| Path-Tree (Jin et al., SIGMOD’08) | $\log^2 k'$ | $O(mk')$/$O(mn)$ | $O(nk')$ |
| 2-HOP (SODA 2002) | $O(m^{1/2})$ (conjecture) | $O(n^3 \lvert TC \rvert)$ | $O(nm^{1/2})$ (conjecture) |

3-HOP (Jin et al., SIGMOD’09): $O(kn^2 \lvert \text{contour} \rvert)$, $O(mk \log n)$, $O(\log n + k)$

When graphs are denser, the size of the compressed transitive closure grows very large; expensive construction cost!
![Page 55: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/55.jpg)
Distance Query
[Figure: example directed graph on vertices 1–15]
?Distance(1,11) 3
?Distance(3,9) (-1)
The problem: Given two vertices u and v in a directed graph G, what is the length of the shortest path from u to v?
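For unweighted graphs, a distance query can be answered online with breadth-first search. The Python sketch below returns -1 for unreachable pairs, following the convention shown in the example queries. The graph itself is hypothetical, since the slide's edges are not in the transcript.

```python
from collections import deque

def distance(graph, u, v):
    """Return the number of edges on a shortest path from u to v, or -1 if none."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return dist[x]
        for y in graph.get(x, ()):
            if y not in dist:
                dist[y] = dist[x] + 1   # first visit in BFS gives the shortest distance
                queue.append(y)
    return -1

# Hypothetical directed graph.
g = {1: [2, 3], 2: [4], 3: [5], 4: [5], 5: [6]}
print(distance(g, 1, 6), distance(g, 6, 1))
```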
![Page 56: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/56.jpg)
Label-Constraint Reachability (SIGMOD’10)
[Figure: example directed graph on vertices 0–15 with edge labels a–e]
Q1: Can vertex 0 reach 9 only through edge labels { a,b,c } ?
Yes
Can vertex 0 reach 9 only through edge labels { a,b } ?
No
Given vertices u and v in a labeled graph G and a label set A, is there a path from u to v with edge labels in A?
![Page 57: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/57.jpg)
Label-Constraint Reachability (LCR)
• Label-Constraint Reachability Query: Can u reach v through a path whose edge labels must satisfy certain membership constraints?
• The path’s edge labels must be in the set of constraint labels
• Social Networks: Whether person A is a remote relative of B (Is there a path from A to B where each edge label is one of parent-of, brother-of, sister-of?)
• Metabolic Network: Is there a pathway between two compounds which can be activated under certain conditions (a set of enzymes)?
![Page 58: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/58.jpg)
Depth First Search
[Figure: example directed graph on vertices 0–15 with edge labels a–e]
Can vertex 0 reach 9 only through edge labels { a,b,c } ?
[Figure: vertices 0, 3, 6 marked during the search]
Result: Yes
Complexity: O(|V|+|E|)
May be sped up with a “focused” procedure using a traditional index
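A label-constrained depth-first search only differs from a plain one in that it skips edges whose label is outside the allowed set. The sketch below assumes a hypothetical adjacency-list format of (neighbor, label) pairs and a small made-up graph. The slide's actual labeled graph is not recoverable from the transcript.

```python
def lcr_dfs(edges, u, v, allowed):
    """Return True if v is reachable from u using only edges labeled in `allowed`."""
    allowed = set(allowed)
    visited = {u}
    stack = [u]
    while stack:
        x = stack.pop()
        if x == v:
            return True
        for y, label in edges.get(x, ()):
            # Follow an edge only if its label satisfies the constraint.
            if label in allowed and y not in visited:
                visited.add(y)
                stack.append(y)
    return False

# Hypothetical labeled graph: {vertex: [(neighbor, edge label), ...]}.
labeled = {
    0: [(1, "a"), (2, "d")],
    1: [(3, "c")],
    2: [(3, "b")],
    3: [],
}
print(lcr_dfs(labeled, 0, 3, {"a", "b", "c"}), lcr_dfs(labeled, 0, 3, {"a", "b"}))
```

Each vertex and edge is examined at most once, which matches the O(|V|+|E|) bound stated above.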
![Page 59: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/59.jpg)
Managing Graph Data
• Reachability and distance queries are among the most important and frequently used queries in graph databases, and they are also the basic operators for solving more complex graph queries.
• Constructing indices (with fast query time, small index size, and reasonable construction cost) is an important research problem!
• The 3-HOP approach provides a unified framework for addressing this challenge!
![Page 61: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/61.jpg)
Semantic Queries
• President Crime
• North Atlantic Temperature
• Chinese Computer Scientists USA
• Coaches Kent State
• "Jeopardy!"'s Man vs. Machine Match
• Question Answering
![Page 62: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/62.jpg)
![Page 63: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/63.jpg)
![Page 64: An Introduction to Network Science and Network Data Management Ruoming Jin Department of Computer Science Kent State University.](https://reader035.fdocuments.in/reader035/viewer/2022062422/56649eaa5503460f94bb021b/html5/thumbnails/64.jpg)
Thanks!!! Questions?