Social network architecture - Part 3. Big data - Machine learning
phu-luong-trong -
![Page 1: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/1.jpg)
<SOCIAL NETWORK ARCHITECTURE>@DEV ZONE
![Page 2: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/2.jpg)
Overview architecture
Web Apps
Web Service APIs
Mobile Apps
4. Front-end
SSO
User ranking
1. Core User
User Data Storage
Real-time Notification
News Feed
2. User Activity System
User Activity Storage
3. Others
Real-time Chat
Search System
Suggestion System
3. Big Data System
Big Data Storage
…
External Apps
Service Data
User
Administrator
![Page 3: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/3.jpg)
BIG Data
![Page 4: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/4.jpg)
Definitions
Dan Ariely defined:
![Page 5: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/5.jpg)
Definitions
• Wikipedia. Big data is a broad term for data sets so large or complex that traditional data processing applications are inadequate. Challenges include analysis, capture, curation, search, sharing, storage, transfer, visualization, and information privacy. The term often refers simply to the use of predictive analytics or certain other advanced methods to extract value from data, and seldom to a particular size of data set.
• Intel. Big data opportunities emerge in organizations generating a median of 300 terabytes of data a week. The most common forms of data analyzed in this way are business transactions stored in relational databases, followed by documents, e-mail, sensor data, blogs, and social media.
• Microsoft. “Big data is the term increasingly used to describe the process of applying serious computing power—the latest in machine learning and artificial intelligence—to seriously massive and often highly complex sets of information.”
• Oracle. Big data is the derivation of value from traditional relational database-driven business decision making, augmented with new sources of unstructured data.
![Page 6: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/6.jpg)
Definitions
• Gartner. Big data is marked by the increasing size of data, the increasing rate at which it is produced, and the increasing range of formats and representations employed. The original report predated the term “big data” but proposed a three-fold definition encompassing the “three Vs”: Volume, Velocity and Variety. This idea has since become popular and sometimes includes a fourth V, Veracity, to cover questions of trust and uncertainty.
![Page 7: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/7.jpg)
![Page 8: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/8.jpg)
Definitions
IBM defined:
• Capture data
• Manage data
• Analyze data
![Page 9: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/9.jpg)
Big Data Architecture
![Page 10: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/10.jpg)
Data analysis
• Artificial Intelligence - AI
• Machine learning
• Robotics
• Computer vision
![Page 11: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/11.jpg)
Machine learning
Applications:
• Data analysis: stock market, financial market, user action …
• Weather forecast
• Natural Language Processing
• Search engine
![Page 12: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/12.jpg)
Machine learning
Methods:
• Supervised learning
• Unsupervised learning
• Semi-supervised learning
• Reinforcement learning
• Data mining
• Data exploration
[Figure: a data set grouped into clusters and a data set sorted into classes; group labels 1 2 3 4 and x y z]
![Page 13: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/13.jpg)
Machine learning
Supervised learning application
• Classify data set
Supervised learning algorithms
1. Decision tree
2. Neural network
3. Naive Bayes classifier
…
[Figures: 1. Decision tree, 2. Neural network]
![Page 14: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/14.jpg)
Machine learning
Problems & Solutions
![Page 15: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/15.jpg)
Machine learning
Decision tree
1. Outlook(sunny) → Humidity(High) → NO
2. Outlook(sunny) → Humidity(Normal) → YES
3. Outlook(overcast) → YES
4. Outlook(rainy) → Windy(TRUE) → NO
5. Outlook(rainy) → Windy(FALSE) → YES
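The five rules above can be read as a small classifier. The following is a minimal sketch in Python, assuming lowercase string values for outlook and humidity and a boolean for windy; the function name `classify` and its parameter names are hypothetical and do not come from the slides.

```python
def classify(outlook, humidity=None, windy=None):
    """Apply the five decision-tree rules and return "YES" or "NO".

    Assumed encodings (not from the slides): outlook is one of
    "sunny", "overcast", "rainy"; humidity is "high" or "normal";
    windy is a bool.
    """
    if outlook == "overcast":
        return "YES"                      # rule 3
    if outlook == "sunny":
        if humidity == "high":
            return "NO"                   # rule 1
        if humidity == "normal":
            return "YES"                  # rule 2
        raise ValueError("sunny branch needs humidity 'high' or 'normal'")
    if outlook == "rainy":
        if windy is None:
            raise ValueError("rainy branch needs a windy flag")
        return "NO" if windy else "YES"   # rules 4 and 5
    raise ValueError(f"unknown outlook: {outlook!r}")


decision = classify("sunny", humidity="normal")
```

Only the attribute tested at each branch is needed, which is why humidity and windy are optional here.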
![Page 16: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/16.jpg)
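Machine learning
Build root node
• Evaluate attributes: Outlook, Temperature, Humidity, Windy
• $\mathrm{entropy}(X) = -\sum_{x \in X} p(x)\,\log_2 p(x)$
• $\mathrm{info}([2,3]) = \mathrm{entropy}\left(\tfrac{2}{5}, \tfrac{3}{5}\right) = 0.971$
• $\mathrm{info}([4,0]) = \mathrm{entropy}\left(\tfrac{4}{4}, \tfrac{0}{4}\right) = 0$
• $\mathrm{info}([3,2]) = \mathrm{entropy}\left(\tfrac{3}{5}, \tfrac{2}{5}\right) = 0.971$
• $\mathrm{info}([2,3],[4,0],[3,2]) = \tfrac{5}{14}\,\mathrm{info}([2,3]) + \tfrac{4}{14}\,\mathrm{info}([4,0]) + \tfrac{5}{14}\,\mathrm{info}([3,2]) = 0.693$
→ $\mathrm{Gain}(\text{outlook}) = \mathrm{info}([9,5]) - \mathrm{info}([2,3],[4,0],[3,2]) = 0.940 - 0.693 = 0.247$
→ $\mathrm{Gain}(\text{temp.}) = 0.029$
→ $\mathrm{Gain}(\text{humidity}) = 0.152$
→ $\mathrm{Gain}(\text{windy}) = 0.048$
• Outlook has the highest gain, so it becomes the root node.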
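The gain figures above can be checked with a short script. This is a sketch under stated assumptions: the class counts for Outlook ([2,3], [4,0], [3,2]) and the overall [9,5] come from the slide, while the per-value counts for Temperature, Humidity and Windy are assumed from the standard 14-row weather data set that these numbers match. The function names are illustrative.

```python
from math import log2


def entropy(counts):
    """Entropy in bits of a class-count list such as [9, 5]."""
    total = sum(counts)
    # Zero counts contribute nothing (0 * log 0 is taken as 0).
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def info(partitions):
    """Weighted average entropy of the subsets produced by a split."""
    total = sum(sum(p) for p in partitions)
    return sum(sum(p) / total * entropy(p) for p in partitions)


def gain(class_counts, partitions):
    """Information gain of a split: entropy before minus info after."""
    return entropy(class_counts) - info(partitions)


# [yes, no] counts per attribute value. Only the Outlook row is taken
# from the slide; the other three rows are assumptions.
splits = {
    "outlook": [[2, 3], [4, 0], [3, 2]],
    "temperature": [[2, 2], [4, 2], [3, 1]],
    "humidity": [[3, 4], [6, 1]],
    "windy": [[6, 2], [3, 3]],
}

gains = {name: gain([9, 5], parts) for name, parts in splits.items()}
root = max(gains, key=gains.get)  # attribute chosen for the root node
```

Under these counts the script reproduces the four gains to three decimals and selects Outlook as the root.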
![Page 17: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/17.jpg)
Machine learning
• Full decision tree
![Page 18: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/18.jpg)
Machine learning
Classifying Steps:
for( i=0; i<n; i++ ){
1. Split data set into n parts
• Training set: n-1 parts
• Test set: 1 part (part i)
• n=10 is the usual choice in practice
2. Training → Model
3. Test → Error rate
}
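The loop above describes n-fold cross-validation. A minimal sketch in plain Python follows, assuming each sample is a `(features, label)` pair; `train_majority` and `predict_majority` are a hypothetical stand-in model (always predict the most common training label) used only to make the example self-contained, and the toy data is invented for illustration.

```python
from collections import Counter


def cross_validate(samples, n, train, predict):
    """Return the mean error rate over n folds."""
    folds = [samples[i::n] for i in range(n)]        # 1. split into n parts
    error_rates = []
    for i in range(n):
        test_set = folds[i]                          # 1 part for testing
        training_set = [s for j, fold in enumerate(folds)
                        if j != i for s in fold]     # n-1 parts for training
        model = train(training_set)                  # 2. training -> model
        errors = sum(1 for features, label in test_set
                     if predict(model, features) != label)
        error_rates.append(errors / len(test_set))   # 3. test -> error rate
    return sum(error_rates) / n


def train_majority(training_set):
    """Hypothetical baseline: remember the most common label."""
    return Counter(label for _, label in training_set).most_common(1)[0][0]


def predict_majority(model, features):
    """Ignore the features and return the remembered label."""
    return model


# Toy data: 20 samples, every fifth one labelled "no".
samples = [((i,), "yes" if i % 5 else "no") for i in range(20)]
error_rate = cross_validate(samples, n=10,
                            train=train_majority, predict=predict_majority)
```

Any pair of `train` and `predict` functions with the same shape can be swapped in, for example a decision-tree learner.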
![Page 19: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/19.jpg)
Advances
• Build training set
• Reinforcement learning and applications
• Machine learning algorithms
![Page 20: Social network architecture - Part 3. Big data - Machine learning](https://reader034.fdocuments.in/reader034/viewer/2022042615/55a77fe61a28ab5f268b4681/html5/thumbnails/20.jpg)
Demo
• Training with data set
• Test model
• Classifying