Web and Social Computing - Presentation Week8

Articles Overview Maneshka Paiva and Matt Courtney

Transcript of Web and Social Computing - Presentation Week8

Page 1: Web and Social Computing - Presentation Week8

Articles Overview

Maneshka Paiva and Matt Courtney

Page 2: Web and Social Computing - Presentation Week8

Article 1: Mining Dynamic Social Networks From Public News Articles For Company Value Prediction

Page 3: Web and Social Computing - Presentation Week8

Overview of Article

● Developed algorithms and a system to infer large-scale evolutionary company networks from public news during 1981-2009

● Prediction of company profits and revenue growth using the network changes over time

● Proposal of a feature extraction and selection algorithm for longitudinal networks

● Measures how networks affect company performance and what network features are important

Page 4: Web and Social Computing - Presentation Week8

Article’s Research Questions

1. Is it possible to predict a company’s value (such as revenue and profit) based on dynamic company networks?

2. How can we infer evolutionary company networks?

3. What features of a longitudinal network are useful for a company and how can they be generated?

Page 5: Web and Social Computing - Presentation Week8

Three company value prediction categories

1. Use the company’s financial statements

2. Use historical trends to identify price patterns and likely company activity

3. Social Network Analysis (SNA) to examine relational and structural embeddedness of companies on intercompany networks

Page 6: Web and Social Computing - Presentation Week8

Three company value prediction categories

1. Use the company’s financial statements

2. Use historical trends to identify price patterns and likely company activity

3. Social Network Analysis (SNA) to examine relational and structural embeddedness of companies on intercompany networks

→ by combining both historical and financial information!

Page 7: Web and Social Computing - Presentation Week8

Data

Extracted Dataset:

● New York Times (1981-2009)

○ Fortune 500 companies (minimum of 3 years)

Metrics:

● Co-occurrence of company names at document level and sentence level.

○ Generate an ‘impact score’ from the two aggregated values.

○ Heuristically weight each factor to emulate a natural relationship.
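The co-occurrence metric above can be sketched roughly as follows. This is only an illustration, not the article's implementation: the substring matching, the naive sentence split, the weights `W_DOC` and `W_SENT`, and the helper `impact_scores` are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical weights: a shared sentence is treated as stronger evidence
# of a relationship than a shared document. The values are illustrative only.
W_DOC = 1.0
W_SENT = 3.0


def impact_scores(documents, companies):
    """Return {(company_a, company_b): score} from co-occurrence counts."""
    doc_counts = Counter()
    sent_counts = Counter()
    for doc in documents:
        # Document-level co-occurrence: both names appear anywhere in the text.
        in_doc = sorted(c for c in companies if c in doc)
        for pair in combinations(in_doc, 2):
            doc_counts[pair] += 1
        # Sentence-level co-occurrence, using a naive split on full stops.
        for sentence in doc.split("."):
            in_sent = sorted(c for c in companies if c in sentence)
            for pair in combinations(in_sent, 2):
                sent_counts[pair] += 1
    # Weighted sum of the two aggregated counts.
    return {
        pair: W_DOC * doc_counts[pair] + W_SENT * sent_counts[pair]
        for pair in doc_counts
    }


docs = [
    "Microsoft and Intel announced a deal. Dell was not involved.",
    "Intel reported earnings. Microsoft shares rose.",
]
scores = impact_scores(docs, ["Microsoft", "Intel", "Dell"])
```

Here the Intel and Microsoft pair scores highest because it shares two documents and one sentence, while the pairs involving Dell only share a document.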

Page 8: Web and Social Computing - Presentation Week8

Network

“Longitudinal intercompany impact networks from public news (i.e. New York Times)”

Page 9: Web and Social Computing - Presentation Week8

Network

[Figure: Microsoft’s network in 2003 and in 2009]

Page 10: Web and Social Computing - Presentation Week8

Data Processing

Dataset Generation

Given a set of companies (V), a time period (T), and a data source (D)

Extract inter-company networks at each given period GT = {Gt1, Gt2, …, Gtk}

Feature Vectors

Pick a company from the set V (x ∈ V)

Generate a feature vector from its embeddedness in the inter-company networks GT (FTx)
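The two steps above can be sketched as follows, assuming the networks are stored as weighted adjacency dictionaries keyed by period. The helper names and the two per-period features (degree and total edge weight) are assumptions for illustration; the article's actual feature set is much larger.

```python
from collections import defaultdict


def build_networks(edges_by_period):
    """Turn {period: [(a, b, weight), ...]} into {period: adjacency dict}."""
    networks = {}
    for period, edges in edges_by_period.items():
        adjacency = defaultdict(dict)
        for a, b, weight in edges:
            # Undirected network: store the edge in both directions.
            adjacency[a][b] = weight
            adjacency[b][a] = weight
        networks[period] = dict(adjacency)
    return networks


def feature_vector(company, networks):
    """Concatenate simple per-period features for one company."""
    vector = []
    for period in sorted(networks):
        neighbours = networks[period].get(company, {})
        vector.append(len(neighbours))           # degree in this period
        vector.append(sum(neighbours.values()))  # total edge weight in this period
    return vector


# Illustrative edge weights only, not values from the article.
edges = {
    2003: [("Microsoft", "Intel", 5.0), ("Microsoft", "Dell", 1.0)],
    2004: [("Microsoft", "Intel", 2.0)],
}
G_T = build_networks(edges)
F_x = feature_vector("Microsoft", G_T)
```

A company that is absent from a period's network simply contributes zeros for that period.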

Page 11: Web and Social Computing - Presentation Week8

Mining Network - Predicting Value

72-dimensional temporal network effects generated

● Some of those features do not have a positive effect on company valuations.

○ Feature selection can be beneficial to both accuracy and efficiency.

● Three methods of feature selection considered:

○ Individual feature selection

○ Network feature variance

○ Feature set selection

Page 12: Web and Social Computing - Presentation Week8

Mining Network - Individual Feature Selection

Use Spearman’s correlation to find important features (high correlation).
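As a rough sketch of this step, Spearman's correlation can be computed as the Pearson correlation of the ranks, and features kept when the absolute correlation with the target is high. The `threshold` of 0.5, the helper names and the sample data are assumptions for illustration, not values from the article.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def select_features(features, target, threshold=0.5):
    """Keep features whose |rho| with the target meets a hypothetical threshold."""
    return [name for name, column in features.items()
            if abs(spearman(column, target)) >= threshold]


# Made-up sample data: one feature tracks profit, the other does not.
profits = [1.0, 2.0, 3.0, 4.0, 5.0]
features = {"degree": [2, 3, 5, 8, 13], "noise": [3, 1, 5, 2, 4]}
selected = select_features(features, profits)
```

Because only ranks matter, `degree` is kept even though its relationship to profit is not linear.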

Page 13: Web and Social Computing - Presentation Week8

Mining Network - Individual Feature Selection

“If there is an increase in the ratio of the number of connections that a company has with the number of connections that its neighbors have, then the value of its profits will increase.”

Page 14: Web and Social Computing - Presentation Week8

Mining Network - Network Feature Variance

Tune a network with a threshold variance:

● Some features will depend very sensitively on the existence of an edge.

○ Measure feature variance in connected structure with different thresholds.

● A feature with high variance changes greatly when an edge with a highly connected neighbor is removed.

○ This indicates how important a feature is for the network.
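One way to picture this measurement is to recompute a feature while progressively pruning weak edges and then take the variance of the results. The sketch below uses degree as the feature and population variance as the spread measure; both choices, along with the thresholds and edge weights, are assumptions for illustration rather than the article's procedure.

```python
from statistics import pvariance


def degree_at_threshold(edges, company, threshold):
    """Degree of `company` after dropping edges whose weight is below `threshold`."""
    return sum(1 for a, b, w in edges if w >= threshold and company in (a, b))


def feature_variance(edges, company, thresholds, feature=degree_at_threshold):
    """Variance of a feature across networks pruned at different thresholds."""
    values = [float(feature(edges, company, t)) for t in thresholds]
    return pvariance(values)


# Illustrative weighted edges only.
edges = [
    ("Microsoft", "Intel", 5.0),
    ("Microsoft", "Dell", 1.0),
    ("Microsoft", "IBM", 3.0),
    ("Intel", "Dell", 2.0),
]
variance = feature_variance(edges, "Microsoft", thresholds=[1.0, 2.0, 4.0])
```

Any other feature function with the same `(edges, company, threshold)` signature can be passed in through the `feature` argument.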

Page 15: Web and Social Computing - Presentation Week8

Mining Network - Network Feature Variance

Page 16: Web and Social Computing - Presentation Week8

Mining Network - Feature Set Selection

A better outcome can be found if more than one feature is used.

Feature selection models from Geng et al. used.

● Maximise the sum of importance scores of individual features.

● Minimise the sum of similarity scores between features.

After generating and selecting feature sets, they are used to predict company value.
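The two objectives above (high importance, low similarity) can be approximated with a simple greedy heuristic: repeatedly pick the highest-scoring feature, then penalise the remaining features by their similarity to it. This is a sketch under assumptions, not necessarily the exact procedure of Geng et al.; the `penalty` parameter, the feature names and all scores are hypothetical.

```python
def select_feature_set(importance, similarity, k, penalty=1.0):
    """Greedily pick k features with high importance and low mutual similarity."""
    scores = dict(importance)  # working copy, adjusted as features are picked
    selected = []
    while scores and len(selected) < k:
        best = max(scores, key=scores.get)
        selected.append(best)
        del scores[best]
        # Discount every remaining feature by its similarity to the one just picked.
        for other in scores:
            pair = frozenset((best, other))
            scores[other] -= penalty * similarity.get(pair, 0.0)
    return selected


# Hypothetical importance and pairwise similarity scores.
importance = {"degree": 0.9, "weighted_degree": 0.85, "betweenness": 0.6}
similarity = {
    frozenset(("degree", "weighted_degree")): 0.8,
    frozenset(("degree", "betweenness")): 0.1,
}
chosen = select_feature_set(importance, similarity, k=2)
```

In this example `weighted_degree` is individually more important than `betweenness`, but it is skipped because it is nearly redundant with `degree`.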

Page 17: Web and Social Computing - Presentation Week8

Evaluation of prediction results

STEPS:

1. Predict company values (profits and revenues) for 20 of the ‘Fortune 500’ companies

2. Effectiveness of network features and parameters on profits and revenues

Page 18: Web and Social Computing - Presentation Week8

Performance Evaluation measures

Squared Correlation Coefficient (r²) - To quantify correlation

Page 19: Web and Social Computing - Presentation Week8

Performance Evaluation measures

Mean Squared Error (MSE) - Error between the predicted valuations and the true valuation
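Both evaluation measures are standard and can be computed directly. The sketch below uses made-up numbers to show why the two are reported together: predictions that are consistently offset from the true values still correlate perfectly, so only the MSE reveals the error.

```python
def mse(actual, predicted):
    """Mean squared error between true and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def squared_correlation(actual, predicted):
    """Square of the Pearson correlation between true and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_a * var_p)


# Illustrative values only: every prediction is too high by 0.5.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
error = mse(actual, predicted)
r2 = squared_correlation(actual, predicted)
```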

Page 20: Web and Social Computing - Presentation Week8

Step 1: Company value prediction - (1)

● Select 20 large companies from different industries from the ‘Fortune 500’ list

● Companies Selected: IBM, Intel, Microsoft, GM, HP, Honda, Nissan, AT&T, Walmart, Yahoo, Nike, Dell, Starbucks, JP Morgan, Pepsi, Cisco Systems, FedEx, The Gap, American Electric Power, Sun Microsystems.

● These made it to the list continuously for many years and have information regarding company valuation and network.

Page 21: Web and Social Computing - Presentation Week8

Step 1: Company value prediction - (2)

● Learn a profit model from each 5 years’ networks and predict the next year’s profits

● Then compare the actual profit earned in that year with the predicted profit value
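The train-then-predict scheme above amounts to a sliding window over the years. The sketch below only generates the training and target years for each step; the choice of model fitted on each window is left open, and the helper name and year range are assumptions for illustration.

```python
def rolling_windows(years, window=5):
    """Yield (training_years, target_year) pairs for a sliding window."""
    years = sorted(years)
    for start in range(len(years) - window):
        yield years[start:start + window], years[start + window]


# Hypothetical year range, chosen only to keep the example short.
splits = list(rolling_windows(range(1990, 1998), window=5))
for training_years, target_year in splits:
    # A model would be fitted on the networks of `training_years` here
    # and evaluated against the actual profit in `target_year`.
    pass
```

Setting `window=10` gives the longer training period used later in the presentation.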

Page 22: Web and Social Computing - Presentation Week8

Step 1: Company value prediction - (3)

● 1995 was the ONLY year in which the prediction was much lower than the real value

Page 23: Web and Social Computing - Presentation Week8

Step 1: Company value prediction - (4)

● Learn 10 years to predict the following year’s profit

Page 24: Web and Social Computing - Presentation Week8

Step 2: Effectiveness of network features - (1)

● Evaluate effectiveness of network features

● Use different feature sets to predict companies’ mean profits over the years, and take the average to compare the prediction performance of each feature set.

Page 25: Web and Social Computing - Presentation Week8

Step 2: Effectiveness of network features - (2)

● Feature set notations:

➔ s - Structural Features (network features)

➔ t - Temporal Features

➔ p - Financial Features

➔ d - Delta change in temporal features

● Combine notation:

➔ sp - network features + financial features

➔ stdp - combination of all features
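The combined notation can be read as concatenating the named feature groups into one vector. A minimal sketch, with made-up feature values and a hypothetical `combine` helper:

```python
def combine(feature_groups, notation):
    """Concatenate feature groups in the order given by a notation such as 'sp'."""
    vector = []
    for key in notation:
        vector.extend(feature_groups[key])
    return vector


# Placeholder values for one company; none of these numbers come from the article.
groups = {
    "s": [3, 0.4],     # structural (network) features
    "t": [0.7],        # temporal features
    "d": [0.1],        # delta change in temporal features
    "p": [12.5, 1.8],  # financial features
}
sp = combine(groups, "sp")
stdp = combine(groups, "stdp")
```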

Page 26: Web and Social Computing - Presentation Week8

Step 2: Effectiveness of network features - (3)

● Using ‘p’ (financial profile) only has better performance than s, t, d individually

● Combined feature sets improve the profit prediction performance

● Using the ‘stdp’ (combination of all) feature set outperforms the network (‘s’) only and financial (‘p’) only feature sets by 150% and 34% respectively

Page 27: Web and Social Computing - Presentation Week8

Step 2: Effectiveness of network features - (4)

● Network features (‘s’, ‘t’, ‘d’) do not seem to contribute to revenue prediction

● In the graphs, any combination that includes the financial feature set ‘p’ shows a significant impact on the revenue prediction

Page 28: Web and Social Computing - Presentation Week8

Step 2: Effectiveness of network features - (5)

Learning outcomes of using network features:

● For profit prediction, the financial feature set contributes more than the network feature sets.

● A combination of all features further improves the profit prediction.

● Network features do not seem to contribute to revenue prediction. Only the financial feature set does.

Page 29: Web and Social Computing - Presentation Week8

Step 2: Effectiveness of parameters - (1)

● Tune parameters window size and delta size

● Compare companies’ networks that existed 1 and 3 years prior

● Take the average of r² across the different years

Page 30: Web and Social Computing - Presentation Week8

Step 2: Effectiveness of parameters - (2)

● Both window size and delta size showed similar results

● Better results when using networks from the previous year (window, delta = 1) rather than 3 years prior

Page 31: Web and Social Computing - Presentation Week8

Step 2: Effectiveness of parameters - (3)

Learning outcomes of using parameters:

● Either window or delta alone is adequate for profit prediction

● A window or delta of 1 previous year gives a more accurate prediction than networks from 3 years prior

● Networks display a 1-year lagged impact on changes in a company’s value

Page 32: Web and Social Computing - Presentation Week8

Article’s Research Questions - Learning Outcomes

1. Is it possible to predict a company’s value (such as revenue and profit) based on dynamic company networks?

Yes, we have seen this by using dynamic social networks of companies.

2. How can we infer evolutionary company networks?

Developed an algorithm to infer longitudinal company networks.

3. What features of a longitudinal network are useful for a company and how can they be generated?

Use of network features, financial features and combinations of these features for company value prediction. Deduced that network features contribute towards profit prediction but do not seem to help predict revenues.

Page 33: Web and Social Computing - Presentation Week8

Article 2: Network Science, Web Science and Internet Science

Page 34: Web and Social Computing - Presentation Week8

Definitions

Network Science - Understanding the emergence of networks, developing models to foresee how networks evolve and optimising networks.

Web Science - Study of large-scale socio-technical systems such as the WWW. This considers the relationship between people and technology, the ways in which they complement each other, and the impact they have on society.

Internet Science - A discipline that looks into the evolution of internet networks and society. The internet provides an infrastructure on which human activity is fast becoming heavily reliant.

Page 35: Web and Social Computing - Presentation Week8

Thanks for listening. Questions?