Future Trends in Data Analytics - TU Dresden
Future Trends in Data Analytics
Hannes Voigt
General Information
Lecture Notes
- Further information on our website http://wwwdb.inf.tu-dresden.de
Dresden Database System Group
Future is Now
- …, face detection & recognition, autobeam, Siri/Cortana/…, 3D printing, ...
[http://www.bostondynamics.com/][http://[https://www.google.com/selfdrivingcar/] [http://www.idsc.ethz.ch/research-dandrea/research-projects/cubli.html]
Cubli
[http://www.ibmwatson.com/]
… tomorrow?
Sooner than you think!
When did we enter the future?
What was the threshold we crossed?
Exponential Growth
Exponential Growth
Outnumbers everything else quickly
- Asymptotic advantage
- Quickly increasing add-on
Leads to surprising results
- Black swans in economic crisis
- Shooting stars in business, media, sport, etc.
[http://roadmap2retire.com/2016/06/disruptive-technologies-exponential-growth/]
[http://xkcd.com/1727/]
Second Half of the Chessboard
When exponential growth really kicks in
- According to Ray Kurzweil
- Things start to get interesting in the second half of the chessboard
- Beyond 4G (2^32, roughly 4.3 billion), numbers quickly go beyond human intuition
- What happens in the second half can hardly be foreseen
[https://en.wikipedia.org/wiki/File:Wheat_Chessboard_with_line.svg]
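The chessboard arithmetic can be checked in a few lines of Python. This is a minimal sketch, assuming the classic setup of one grain on the first square and a doubling on every following square; the function name is illustrative only.

```python
def grains_on_square(n: int) -> int:
    """Grains on square n (1-based), doubling from one grain on square 1."""
    return 2 ** (n - 1)

# Squares 1..32 form the first half, squares 33..64 the second half.
first_half = sum(grains_on_square(n) for n in range(1, 33))
second_half = sum(grains_on_square(n) for n in range(33, 65))

print(f"first half total:   {first_half:,}")
print(f"square 33 alone:    {grains_on_square(33):,}")
print(f"second half total:  {second_half:,}")
print(f"ratio second/first: {second_half // first_half:,}")
```

Under these assumptions, the first square of the second half already holds more grains than the entire first half combined.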
First half of the Chessboard
Welcome to the Second Half of the Chessboard
Digitization
Everything is Digital
Landscape has changed
From islands of digital data … (Tuamotu Archipelago, French Polynesia)
… to ponds of analog signals (Algonquin Provincial Park, Ontario, Canada)
Everything is Digital
World’s capacity to store information: 94% digital in 2007
World’s capacity to telecommunicate: 99% digital in 2007
[Figure: analog vs. digital share of each capacity]
[M. Hilbert and P. Lopez, The World's Technological Capacity to Store, Communicate, and Compute Information, Science, 332, April 2011, DOI: 10.1126/science.1200970]
Everything is Digital
[http://blog.acronis.com/posts/data-everything-8-noble-truths]
Big Data
The term Big Data reflects the three drivers
Exponential growth
- beyond intuition
- second half of the chessboard
Digitization
- Everything is digital data
- Analog signals are not part of big data
Big Data
[http://www.ibmbigdatahub.com/infographic/four-vs-big-data]
Data, data, everywhere… Volume
The Petabyte Age
- 2008
- Eric Schmidt (in 2010): Every 2 Days We Create As Much Information As We Did Up To 2003
The Zettabyte Age
- 2015
- One Zettabyte = Stack of books from Earth to Pluto 20 times
The internet in 2020
- ~26.3 billion networked devices
- 25.1 GB average traffic per capita per month
- 2.3 Zettabytes annual IP traffic
Data is produced continuously Velocity
Human-produced data
- ~300 million emails sent/received per minute in 2016
- >0.4 million tweets per minute in 2016
- ~138 million Google searches per minute in 2016
- >400 hours of video uploaded to YouTube per minute in 2016
Machine-produced data
- IoT will boost data velocity greatly
- Sensors become ubiquitous
- Sensors for sound, images, position, motion, temperature, pressure, etc.
- Resolution (in time and space) is continuously increasing
- Assume Waze-like cars collecting 20 double values every second
- With one million driving cars, that is almost 10 GB every minute (20 values × 8 bytes × 1 million cars × 60 s ≈ 9.6 GB)
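The back-of-the-envelope estimate for the Waze-like cars can be re-derived with a short calculation. This is a sketch, assuming 8-byte double-precision values and decimal units (1 GB = 10^9 bytes); the constant and function names are illustrative. Under these assumptions the total comes out at about 9.6 GB per minute.

```python
BYTES_PER_DOUBLE = 8    # assumption: IEEE 754 double precision
VALUES_PER_SECOND = 20  # from the slide
CARS = 1_000_000        # from the slide

def bytes_per_minute(cars: int, values_per_second: int, bytes_per_value: int) -> int:
    """Raw sensor payload produced by all cars in one minute."""
    return cars * values_per_second * bytes_per_value * 60

total = bytes_per_minute(CARS, VALUES_PER_SECOND, BYTES_PER_DOUBLE)
print(f"{total / 1e9:.1f} GB per minute")
print(f"{total * 60 * 24 / 1e12:.1f} TB per day")
```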
Everything is data Variety
Data from all kinds of digital sources
- Structured vs. unstructured
- Text vs. image
- Curated vs. automatically collected
- Raw vs. edited vs. refined
Semantic heterogeneity
- Decentralized content generation
- Multiple perspectives (conceptualizations) of the reality
- Ambiguity, vagueness, inconsistency
“A lot of Big Data is a lot of small data put together.”
Data is messy Veracity
Quality of captured data varies greatly
- Sensor inaccuracy
- Human mistakes
- Incompleteness
- Untrusted sources
- Deterministic processes
- etc.
“Next to the analytes, we see everything in the results, from the perfume of the lab assistant to the softener in the new machine sitting next.” —About Liquid chromatography–mass spectrometry at IPB Halle
Data Science/Data Analysis
“Data is the new oil. It’s valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc. to create a valuable entity that drives profitable activity; so must data be broken down, analyzed for it to have value.” –Clive Humby
Value – the fifth V of Big Data
… or how to turn raw data into something valuable?
Levels of Analysis
Reporting
Analytics
Descriptive Analytics
How have I done? (and why?)
- Simplest class of analytics
- Condense big data into smaller, more useful bits of information
- Summary of what happened
- 70-80% penetration
Examples
- Database aggregation queries
- Business reporting (e.g. sales figures)
- Market survey (e.g. GFK)
- (Classical) business intelligence, dashboards, scorecards
- Google Analytics
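A database aggregation query, the first example above, can be illustrated with Python's built-in `sqlite3` module. This is a minimal sketch: the `sales` table, its columns, and the figures are hypothetical and serve only to show how raw records are condensed into a summary.

```python
import sqlite3

# Hypothetical sales records: (region, amount)
SALES = [("north", 120.0), ("south", 80.0), ("north", 200.0),
         ("south", 50.0), ("west", 75.0)]

def sales_summary(rows):
    """Return (region, order_count, total) per region via a SQL aggregation query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        return conn.execute(
            "SELECT region, COUNT(*), SUM(amount) FROM sales "
            "GROUP BY region ORDER BY region"
        ).fetchall()
    finally:
        conn.close()

summary = sales_summary(SALES)
for region, n, total in summary:
    print(f"{region}: {n} orders, total {total:.2f}")
```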
Predictive Analytics
How will I do?
- Next step up in data reduction
- Studies recent and historical data
- Utilizes a variety of statistical, modeling, data mining and machine learning techniques
- Allows (potentially inaccurate) predictions about the future
- Use data you have to create data you do not have
- 15-25% penetration
Examples
- Market developments
- Stock developments
- Movie/product recommendations on Netflix/Amazon
- Energy demand and supply forecasting
- Predictive policing (e.g. precobs, predpol)
Fill in the blanks
Project the curve
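"Project the curve" can be illustrated with the simplest possible model, an ordinary least-squares line fitted to historical values and extended by one step. This is only a sketch: the demand figures are made up, and real predictive analytics would typically use richer models and report uncertainty.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical yearly demand figures (illustrative, not from the lecture)
years = [2012, 2013, 2014, 2015, 2016]
demand = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(years, demand)
forecast_2017 = slope * 2017 + intercept  # data we do not have yet
print(f"forecast for 2017: {forecast_2017:.1f}")
```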
Prescriptive Analytics
What should I do?
- Predicts “multiple futures” based on the potential actions
- Recommends the best course of action for any pre-specified outcome
- Typically involves a feedback system to track outcome produced by the action taken
- Utilizes predictive methods + optimization techniques
- 1-5% penetration
Examples
- Energy load balancing by flexoffer scheduling
- Inventory optimization in supply chains
- Targeted marketing campaign optimization
- Focus treatment of clinical obesity in health care
- Waze-like car navigation
ORION – On-Road Integrated Optimization and Navigation
Annual savings in U.S. in 2016
- 100 million miles
- 100 thousand CO2 tonnes
- 10 million fuel gallons
- $300-400 million costs
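The prescriptive pattern of predicting several futures and then recommending the best action can be sketched on a toy inventory problem. All numbers, names, and the profit model below are assumptions for illustration; this is not the method used by any of the systems named on the slide.

```python
# Hypothetical demand forecast: demand level -> probability (the "multiple futures")
DEMAND_SCENARIOS = {80: 0.2, 100: 0.5, 120: 0.3}
UNIT_COST = 6.0    # illustrative purchase cost per unit
UNIT_PRICE = 10.0  # illustrative sales price per unit

def expected_profit(order_qty: int) -> float:
    """Expected profit of an order quantity, averaged over the demand scenarios."""
    return sum(
        p * (UNIT_PRICE * min(order_qty, demand) - UNIT_COST * order_qty)
        for demand, p in DEMAND_SCENARIOS.items()
    )

def best_order(candidates):
    """Pick the candidate action with the highest expected profit."""
    return max(candidates, key=expected_profit)

choice = best_order([80, 100, 120])
print(choice, round(expected_profit(choice), 2))
```

A feedback loop, as mentioned on the slide, would compare the realized profit with the prediction and update the scenario probabilities; that part is omitted here.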
Roles in Big Data Projects
Data scientist
- Data science is a systematic method dedicated to knowledge discovery via data analysis
- In business, optimize organizational processes for efficiency
- In science, analyze experimental/observational data to derive results
- Typical skills
  - Statistics + (mathematics) background
  - Computer science: Programming, e.g.: R, (SAS,) Java, Scala, Python; Machine learning
  - Some domain knowledge for the problem to solve
Data engineer
- Data engineering is the domain that develops and provides systems for managing and analyzing big data
- Build modular and scalable data platforms for data scientists
- Deploy big data solutions
- Typical skills
  - Computer science background
  - Databases
  - Software engineering
  - Massively parallel processing
  - Real-time processing
  - Languages: C++, Java, (Scala,) Python
  - Understand performance factors and limitations of systems
Our Focus in Teaching and Research
Levels of Analysis
Reporting
Analytics
Methods of Data Science
- Lecture Datenintegration und -analyse
Big Data Landscape(s)
Data Systems
Data systems are in the middle of all this
A data system…
- …stores data…
- …provides access to data…
- …and (ideally) makes data analysis easy
Different data systems use different data models
[Figure labels: big data, data systems]
(Big) Data System Landscape(s)
Concept and techniques of data systems
- Lecture Architektur von Datenbanksystemen
- Lecture Big Data Platforms
Graph Building Blocks
Nodes (Dots)
- Like an entity in ER
- Exist on their own
- Have object identity
Edges (Lines)
- Like a relationship in ER
- Exist only between nodes
- Identity depends on the nodes they connect
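These building blocks map naturally onto a small data structure. The following is one possible sketch in Python; the class names, the `kind` attribute, and the use of sets are illustrative choices and not taken from a particular graph system.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Node:
    """A node exists on its own and carries its own identity (node_id)."""
    node_id: int
    label: str

@dataclass(frozen=True)
class Edge:
    """An edge is identified by the nodes it connects (plus a relationship kind)."""
    source: Node
    target: Node
    kind: str

@dataclass
class Graph:
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)

    def add_node(self, node: Node) -> None:
        self.nodes.add(node)

    def add_edge(self, source: Node, target: Node, kind: str) -> Edge:
        # Edges exist only between nodes that are already part of the graph.
        if source not in self.nodes or target not in self.nodes:
            raise ValueError("both endpoints must be nodes of the graph")
        edge = Edge(source, target, kind)
        self.edges.add(edge)
        return edge

g = Graph()
alice, bob = Node(1, "Alice"), Node(2, "Bob")
g.add_node(alice)
g.add_node(bob)
g.add_edge(alice, bob, "knows")
g.add_edge(alice, bob, "knows")  # same endpoints and kind: the same edge
print(len(g.nodes), len(g.edges))
```

Because an edge's identity is derived from its endpoints in this sketch, adding the same connection twice leaves a single edge in the set.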
Graph Data – Social Network
Graph Data
Social Graphs
Facebook
- May 2013
As of 12/2014: “1.39 billion active users with more than 400 billion edges”
[Ching et al. One Trillion Edges: Graph Processing at Facebook-Scale. PVLDB 8(12), 2015]
Structured Search
Example: Facebook Graph Search
- Finding subgraph structures
- Very natural way of formulating queries
[http://socialnewsdaily.com/15865/facebook-social-graph-search-a-great-way-to-find-working-professionals-in-your-network/]
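Finding a subgraph structure can be sketched on a toy graph: a query such as "friends of Alice who work at TU Dresden" is a two-edge pattern. The triples, relationship names, and function below are hypothetical and do not reflect how Facebook Graph Search is implemented.

```python
# Hypothetical toy graph as (source, relationship, target) triples
EDGES = {
    ("alice", "friend", "bob"),
    ("alice", "friend", "carol"),
    ("alice", "friend", "dave"),
    ("bob", "works_at", "TU Dresden"),
    ("carol", "works_at", "Acme"),
    ("dave", "works_at", "TU Dresden"),
}

def friends_working_at(person: str, employer: str) -> list:
    """Match the pattern person -friend-> x -works_at-> employer."""
    friends = {t for s, rel, t in EDGES if s == person and rel == "friend"}
    return sorted(x for x in friends if (x, "works_at", employer) in EDGES)

print(friends_working_at("alice", "TU Dresden"))
```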
Social Network Analysis
Example: Twitter communication
- Users tweeting on a specific topic
- Others reply or retweet
- Users can be grouped based on communication topology (-> graph clustering)
- Analysis reveals user groups and dominant communication patterns
[https://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=76277]
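Grouping users by communication topology can be illustrated with the crudest form of graph clustering, connected components of the reply/retweet graph. This is a stand-in sketch with made-up user names; real analyses would normally use community detection methods that also split densely connected graphs.

```python
from collections import defaultdict, deque

# Hypothetical reply/retweet interactions between users
INTERACTIONS = [("ann", "ben"), ("ben", "cat"), ("dan", "eve"), ("fay", "dan")]

def user_groups(pairs):
    """Group users into connected components of the undirected interaction graph."""
    neighbours = defaultdict(set)
    for a, b in pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        seen.add(start)
        queue, group = deque([start]), []
        while queue:  # breadth-first traversal of one component
            user = queue.popleft()
            group.append(user)
            for other in neighbours[user] - seen:
                seen.add(other)
                queue.append(other)
        groups.append(sorted(group))
    return groups

print(user_groups(INTERACTIONS))
```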
Citation Networks
Example: DBLP
- Open bibliographic information on major computer science journals and proceedings
- >3.4 million publications
- >7000 new publications per month
- >1.7 million authors
[http://well-formed.eigenfactor.org/projects/well-formed/radial.html#/] [Emre Sarigöl et al. Predicting Scientific Success Based On Coauthorship Networks. EPJ Data Science, 2014]
Viral Marketing
Viral Marketing
- Spreading content to one person so that more than one person engages with the content
- Techniques
  - Influence estimation
  - Influence maximization
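Both techniques can be sketched on a toy graph under a strong simplifying assumption: content always spreads along every edge, so the influence of a seed set is simply the set of users reachable from it. Practical approaches usually work with probabilistic spreading models; the graph and function names here are hypothetical.

```python
# Hypothetical "who passes content on to whom" graph
FOLLOWERS = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
    "e": ["f"],
    "f": [],
}

def reach(seeds):
    """Influence estimation (simplified): all users reachable from the seed set."""
    reached, stack = set(seeds), list(seeds)
    while stack:
        user = stack.pop()
        for nxt in FOLLOWERS[user]:
            if nxt not in reached:
                reached.add(nxt)
                stack.append(nxt)
    return reached

def greedy_seeds(k):
    """Influence maximization: repeatedly add the seed with the largest total reach."""
    seeds = []
    for _ in range(k):
        candidates = [u for u in sorted(FOLLOWERS) if u not in seeds]
        best = max(candidates, key=lambda u: len(reach(seeds + [u])))
        seeds.append(best)
    return seeds

seeds = greedy_seeds(2)
print(seeds, sorted(reach(seeds)))
```

The greedy choice picks a second seed from the part of the graph the first seed cannot reach, rather than a second well-connected user in the same region.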
Knowledge Graphs
Knowledge graph of a picture
[The National Library of Wales, https://www.llgc.org.uk/blog/?p=11246]
Knowledge Graphs
Sächsische Landesbibliothek – Staats- und Universitätsbibliothek Dresden (SLUB)
- Adds semantic search to the library online catalog
- Utilizes multilingual knowledge data from Wikipedia
- Significant improvements in search quality for library users
[http://www.slideshare.net/JensMittelbach/dswarm-a-library-data-management-platform-based-on-a-linked-open-data-approach]
Supply Chain Management
Supply Chain Optimization
- Customer A: 10-30% reduction in inventory
- Customer B: 8% reduction in transportation costs
[http://www.logicblox.com/solutions/supply-chain-optimization/]
When to send? How much to send? From where to where?
[Figure labels: Inventory, Response Times]
Level of Analytics
Graph Data Management Applications
The Power of Networks
Watch on YouTube: https://youtu.be/nJmGrNdJ5Gw
(Big) Data System Landscape(s)
- Lecture Graph Data Management and Analytics
Use Cases of Search
Searching Textual Contents
Use Cases of Search
Stack Overflow: Programming Q&A site
Use Cases of Search
Image Search
Use Cases of Search
Last.fm: Music Search
Example: Library of Congress
Databases versus Information Retrieval
| | Database | Information Retrieval |
| --- | --- | --- |
| Matching | Exact Match | Partial Match, Best Match |
| Model | Deterministic | Probabilistic |
| Query Language | Structured / Formal | Natural |
| Query Specification | Complete | Incomplete |
| Queried Objects | Matching | Relevant |
| Error Sensitivity | Sensitive | Insensitive |

- Hard to formulate queries
- Iterative workflow based on responses
- Tons of results, but only a few are relevant
- Ranking of results (instead of set of results)
- Representation of document content often inadequate / inaccurate
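The contrast between exact matching and ranked best matching can be shown on a tiny document collection. This is a sketch: the documents are invented, and the score (number of shared terms) is a deliberately simple stand-in for the probabilistic relevance models used in information retrieval.

```python
# Hypothetical mini document collection
DOCS = {
    1: "graph data management and analytics",
    2: "big data platforms",
    3: "information retrieval and ranking of text data",
}

def exact_match(query: str) -> list:
    """Database-style: ids of documents whose text equals the query exactly."""
    return [doc_id for doc_id, text in DOCS.items() if text == query]

def ranked_match(query: str) -> list:
    """IR-style: (doc_id, score) pairs ordered by number of shared terms, best first."""
    terms = set(query.lower().split())
    scored = [(doc_id, len(terms & set(text.split()))) for doc_id, text in DOCS.items()]
    return sorted((s for s in scored if s[1] > 0), key=lambda s: (-s[1], s[0]))

print(exact_match("graph data"))             # incomplete query: no exact match
print(ranked_match("graph data analytics"))  # partial matches, ranked
```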
(Big) Data System Landscape(s)
- Lecture Information Retrieval
Summary
Big Data
- Crossing thresholds in exponential growth & digitization
- Technical challenges in volume, velocity, variety, veracity, value
Related research and lectures
- Data Science → Datenintegration und -analyse
- Data Systems → Architektur von Datenbanksystemen, Big Data Platforms
- Graph Data → Graph Data Management and Analytics
- Search → Information Retrieval