Social Mining Social Computing. Data Mining Data mining is an important new information technology...

Social Mining Social Computing

Data Mining

Data mining is an important new information technology used to identify significant data from vast amounts of records.

It is also part of a process called knowledge discovery in databases, which presents and processes data to obtain knowledge.

Goal and Usefulness of Data Mining

Goals:

Improve the quality of interaction between the system and its users.

Improve decision making.

Usefulness:

An automatic analysis and discovery tool for extracting useful knowledge from huge amounts of valuable information.

Knowledge Discovery Process

Data Mining Methods

1. Association Rules

2. Clustering

3. Classification

4. Forecast

1. Decision trees and rules

2. Non-linear regression and classification methods

3. Example based methods

4. Probabilistic Graphical Dependency Models

5. Relational Learning Models

Data Mining Tasks
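
To make the first item listed above, association rules, more concrete, here is a minimal sketch of how the support and confidence of a rule might be computed over a handful of transactions. The basket data and the function names are invented for illustration and do not come from the slides.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return both / base if base else 0.0


# Hypothetical market-basket data (not from the presentation).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
print(rule_support, rule_confidence)
```

A rule such as "bread -> milk" would typically be kept only if both numbers clear thresholds chosen by the analyst.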

Statistical Inference vs Data Mining

Formal statistical inference is “assumption-driven”: a hypothesis is first formed and then validated against data.

Data mining is “discovery-driven”: patterns and hypotheses are automatically extracted from data.
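
The contrast can be sketched in a few lines. In the sketch below, a plain correlation coefficient stands in for a formal statistical test, and the table, its column names and the `pearson` helper are all hypothetical. The assumption-driven step checks one relationship chosen in advance; the discovery-driven step scans every pair of columns and reports the strongest one.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical columns of a small customer table.
data = {
    "age": [25, 32, 47, 51, 62],
    "visits": [9, 3, 6, 2, 7],
    "spend": [20, 8, 14, 5, 15],
}

# Assumption-driven: state one hypothesis up front, then check it.
hypothesis_r = pearson(data["age"], data["spend"])

# Discovery-driven: scan every pair of columns and keep the strongest link.
strongest = max(
    combinations(data, 2),
    key=lambda pair: abs(pearson(data[pair[0]], data[pair[1]])),
)
print(hypothesis_r, strongest)
```

A pattern found by scanning in this way is only a candidate; it would still need to be validated on fresh data.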

Data Mining – Practical Usage

Direct Marketing; Fraud Control; Credit Analysis; Outlier Analysis.
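
As a small illustration of the last item, outlier analysis, the sketch below flags values that lie far from the mean in units of standard deviation (a z-score rule). This is one simple technique among many and is not necessarily the one the presenters had in mind. The transaction amounts, the function name and the threshold of 2.0 are assumptions made for the example.

```python
from statistics import mean, stdev


def find_outliers(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds threshold."""
    mu = mean(values)
    sigma = stdev(values)
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical card transaction amounts; one is unusually large.
amounts = [42, 38, 45, 40, 41, 39, 44, 43, 37, 950]
suspicious = find_outliers(amounts)
print(suspicious)
```

In a fraud-control setting, a flagged value would normally be passed to a human reviewer rather than rejected automatically.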

Effective Implementation of Data Mining

A. Development of a Data Warehouse

A data warehouse (DW) functions in three layers: staging, integration and access. These layers exist in the DW to meet the users' reporting needs.

Staging is used to store raw data for use by developers (analysis and support).

The integration layer is used to integrate data and to provide a level of abstraction from users.

The access layer is for getting data out for users.
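
The three layers can be pictured with a toy pipeline. The sketch below is only an illustration under stated assumptions: the rows, field names and function names are hypothetical, and a real warehouse would use database tables rather than in-memory lists.

```python
# Staging layer: raw records kept exactly as they arrived (hypothetical rows).
staging = [
    {"cust": " alice ", "amount": "120.50", "region": "north"},
    {"cust": "BOB", "amount": "80", "region": "North"},
    {"cust": "alice", "amount": "30.25", "region": "NORTH"},
]


def integrate(raw_rows):
    """Integration layer: clean types and spellings into one consistent shape."""
    return [
        {
            "customer": row["cust"].strip().lower(),
            "amount": float(row["amount"]),
            "region": row["region"].strip().lower(),
        }
        for row in raw_rows
    ]


def sales_by_customer(rows):
    """Access layer: a report-style view that hides the cleaning steps."""
    totals = {}
    for row in rows:
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + row["amount"]
    return totals


report = sales_by_customer(integrate(staging))
print(report)
```

Keeping the raw rows untouched in staging means the integration rules can be changed and re-run without reloading the source data.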

B. Ease and Simplicity of Data Mining Tools

Produce automated, real-time detection of patterns or anomalies.

Decision Support Systems; Knowledge Discovery in Databases; Data Warehouse

C. Knowledge of Data Analysis

Database specialists and computer scientists can contribute the most in this area.

Three Chief Facilities of Search Engines

1. Gather a set of web pages that form the universe from which users can retrieve information.

2. Represent the pages in this universe in a fashion that attempts to capture their content.

3. Allow searchers to issue queries, employing information retrieval algorithms that attempt to find the most relevant pages from the universe.
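
The three facilities map onto a very small program. The sketch below is a toy, not how any production search engine works: the page URLs and texts are made up, the representation is a plain inverted index, and relevance is approximated by counting how often the query words occur.

```python
from collections import Counter, defaultdict

# 1. Gather: a hypothetical "universe" of pages (URL -> text).
pages = {
    "a.example/mining": "data mining finds patterns in data",
    "b.example/search": "search engines answer queries",
    "c.example/warehouse": "a data warehouse stores data for reporting",
}

# 2. Represent: an inverted index mapping each word to per-page counts.
index = defaultdict(Counter)
for url, text in pages.items():
    for word in text.lower().split():
        index[word][url] += 1


def search(query):
    """3. Query: rank pages by how often they contain the query words."""
    scores = Counter()
    for word in query.lower().split():
        for url, count in index.get(word, {}).items():
            scores[url] += count
    return [url for url, _ in scores.most_common()]


results = search("data mining")
print(results)
```

Real engines replace the raw counts with weighting schemes and link-based signals, but the gather, represent and query stages remain recognizable.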

Data Mining and Web Search Engines

A customer service database stores two types of service information:

1. Unstructured customer service reports.

2. Structured Data on Sales, Employees, and Customers.

Most search engines have advanced search capabilities that will allow the user to specify additional search parameters to obtain more refined results.

A DBMS acts as the access point for involving search engines in a data warehouse environment.

Differences in Web Search and Data Mining

Web searches are usually started with some sort of query in a search engine.

Data mining, by contrast, does its searching based on the data itself, the data mining tools and a specified output format.

Role of Social Scientists

Contribute to:

1. Research.

2. Development of rules for flagging anomalous behavior.

3. Identification and understanding of elements in the data sets.

4. Development of guidelines and methods to ascertain which data mining techniques are the most effective in a particular case.

Assael’s Consumer Information Acquisition and Processing Model

Conceptual Model of Information and Source Utilization

Model of Information Needs

Consumer-Oriented Information Search Model

Examples from The Economist

According to The Economist, there’s a big market for such software. “By one estimate there are more than 100 programs for network analysis, also known as link analysis or predictive analysis. The raw data used may extend far beyond phone records to encompass information available from private and governmental entities, and internet sources such as Facebook. IBM, the supplier of the system used by Bharti Airtel, says its annual sales of such software, now growing at double-digit rates, will exceed $15 billion by 2015. In the past five years IBM has spent more than $11 billion buying makers of network-analysis software. Gartner, a market-research firm, ranks the technology at number two in its list of strategic business operations meriting significant investment this year.”

For Example

The article also touches on more sophisticated systems that integrate additional information, including V.S. Subrahmanian’s work on STOP.

SOMA (Stochastic Opponent Modeling Agents) is a formal, logical-statistical reasoning framework that uses data about the past behavior of terror groups to learn rules about the probability of an organization, community, or person taking certain actions in different situations.

“Called SOMA Terror Organization Portal, it analyses a wide range of information about politics, business and society in Lebanon to predict, with surprising accuracy, rocket attacks by the country’s Hizbullah militia on Israel. Attacks tend to increase, for example, as more money from Islamic charities flows into Lebanon. Attacks decrease during election years, particularly as more Hizbullah members run for office and campaign energetically. By the middle of 2010 SOMA was sucking up data from more than 200 sources, many of them newspaper websites. The number of sources will have more than doubled by the end of the year.”

References

www.emeraldinsight.com/0264-0473.htm

www.emeraldinsight.com/0263-5577.htm

www.emeraldinsight.com/0968-5227.htm

www.economist.com/node/16910031

Journal of Financial Crime, Vol. 12 No. 1

Thank You

Mohd. Ali Khan

Murtaza Marvi

Musa Bin Hamid

Syed Mohsin Hussain