Web and Social Computing - Presentation Week8
Articles Overview
Maneshka Paiva and Matt Courtney
Article 1: Mining Dynamic Social Networks From Public News Articles For Company Value Prediction
Overview of Article
● Developed algorithms and a system to infer large-scale evolutionary company networks from public news during 1981-2009
● Prediction of company profits and revenue growth using the network changes over time
● Proposal of a feature extraction and selection algorithm for longitudinal networks
● Measures how networks affect company performance and what network features are important
Article’s Research Questions
1. Is it possible to predict a company’s value (such as revenue and profit) based on dynamic company networks?
2. How can we infer evolutionary company networks?
3. What features of a longitudinal network are useful for a company and how can they be generated?
Three company value prediction categories
1. Use company’s financial statements
2. Use historical trends to identify price patterns and likely company activity
3. Social Network Analysis (SNA) to examine relational and structural embeddedness of companies on intercompany networks
→ by combining both historical and financial information!
Data
Extracted Dataset:
● New York Times (1981-2009)
○ Fortune 500 companies (minimum of 3 years)
Metrics:
● Co-occurrence of company names at document-level & sentence-level.
○ Generate ‘impact score’ from two aggregated values.
○ Heuristically weight each factor to emulate natural relationship.
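The co-occurrence metric above can be sketched roughly as follows. This is a minimal illustration, not the article's algorithm: the function names, the naive substring matching, the sentence splitter and the weights `w_doc` and `w_sent` are all assumptions made for the example.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents, companies):
    """Count company-pair co-occurrences per document and per sentence."""
    doc_counts, sent_counts = Counter(), Counter()
    for text in documents:
        # Document level: every pair of companies named in the same article.
        present = sorted(c for c in companies if c in text)
        doc_counts.update(combinations(present, 2))
        # Sentence level: naive split after '.', '!' or '?'.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            in_sentence = sorted(c for c in companies if c in sentence)
            sent_counts.update(combinations(in_sentence, 2))
    return doc_counts, sent_counts


def impact_scores(doc_counts, sent_counts, w_doc=1.0, w_sent=2.0):
    """Weighted sum of the two aggregated counts (weights are hypothetical)."""
    pairs = set(doc_counts) | set(sent_counts)
    return {p: w_doc * doc_counts[p] + w_sent * sent_counts[p] for p in pairs}


docs = [
    "Microsoft and IBM announced a deal. Intel was not involved.",
    "IBM reported earnings. Microsoft shares rose.",
]
doc_c, sent_c = cooccurrence_counts(docs, ["IBM", "Intel", "Microsoft"])
scores = impact_scores(doc_c, sent_c)
```

Sentence-level co-occurrence is weighted more heavily here on the guess that companies named in the same sentence are more closely related than companies that only share an article.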
Network
“Longitudinal intercompany impact networks from public news (i.e. New York Times)”
[Figure: Microsoft’s network in 2003 and in 2009]
Data Processing
Dataset Generation
Given a set of companies (V), a time period (T), and a data source (D)
Extract inter-company networks at each given period GT = {Gt1, Gt2, …, Gtk}
Feature Vectors
Pick a company from the set V ( x ∈ V)
Generate a feature vector from its embeddedness in the inter-company networks GT (FTx )
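A minimal sketch of the two steps above, assuming each period's network is stored as a nested dictionary of weighted edges. The two features used (degree and total edge weight) are placeholders chosen for illustration; the article's actual feature set is not listed on the slides.

```python
def feature_vector(networks, company):
    """Concatenate simple per-period network features for one company.

    networks: {period: {company: {neighbour: impact_score}}}
    """
    features = []
    for period in sorted(networks):
        neighbours = networks[period].get(company, {})
        degree = len(neighbours)             # number of connections
        strength = sum(neighbours.values())  # total impact score
        features.extend([degree, strength])
    return features


# Hypothetical G_T with two periods.
G_T = {
    2003: {
        "Microsoft": {"IBM": 4.0, "Intel": 1.0},
        "IBM": {"Microsoft": 4.0},
        "Intel": {"Microsoft": 1.0},
    },
    2004: {
        "Microsoft": {"IBM": 2.5},
        "IBM": {"Microsoft": 2.5},
    },
}
F_x = feature_vector(G_T, "Microsoft")
```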
Mining Network - Predicting Value
72-dimensional temporal network features generated
● Some of those features don’t have a positive effect on company valuations.
○ Feature selection can be beneficial to both accuracy and efficiency.
● Three methods of feature selection considered:
○ Individual feature selection
○ Network feature variance
○ Feature set selection
Mining Network - Individual Feature Selection
Use Spearman’s correlation to find important features (high correlation).
Mining Network - Individual Feature Selection
“If there is an increase in the ratio of the number of connections that a company has to the number of connections that its neighbors have, then the value of its profits will increase.”
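Individual feature selection with Spearman's correlation could look roughly like this. The sketch uses only the standard library; the feature names, the toy data and the 0.5 cut-off are assumptions for illustration.

```python
def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def select_features(features, target, threshold=0.5):
    """Keep features whose |rho| with the target reaches the threshold."""
    scored = {name: spearman(col, target) for name, col in features.items()}
    return {name: rho for name, rho in scored.items() if abs(rho) >= threshold}


profits = [1.0, 2.0, 3.5, 5.0, 8.0]
features = {
    "degree_ratio": [0.2, 0.4, 0.5, 0.9, 1.3],  # rises with profit
    "noise": [3, 1, 4, 1, 5],
}
selected = select_features(features, profits)
```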
Mining Network - Network Feature Variance
Tune a network with a threshold variance:
● Some features will depend very sensitively on the existence of an edge.
○ Measure feature variance in connected structure with different thresholds.
● If a feature has a high associated variance, it is because the feature varies greatly when an edge with a highly connected neighbor is removed.
○ This indicates how important a feature is for the network.
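One way to read the thresholding idea above is sketched below. It assumes the threshold is a cut-off on edge weight and uses degree as the example feature; both are guesses, since the slides do not give the exact procedure.

```python
from statistics import pvariance


def threshold_network(edges, threshold):
    """Keep only edges whose weight reaches the threshold."""
    return {pair: w for pair, w in edges.items() if w >= threshold}


def degree(edges, company):
    """Number of remaining edges that touch the company."""
    return sum(1 for pair in edges if company in pair)


def feature_variance(edges, company, thresholds, feature=degree):
    """Variance of one feature as the edge threshold is varied."""
    values = [feature(threshold_network(edges, t), company) for t in thresholds]
    return pvariance(values)


# Hypothetical weighted edges for one period.
edges = {
    ("IBM", "Microsoft"): 4.0,
    ("Intel", "Microsoft"): 1.0,
    ("IBM", "Intel"): 1.0,
}
var_ms = feature_variance(edges, "Microsoft", [0.5, 2.0, 5.0])
```

A feature whose value swings as edges drop out gets a high variance, which matches the slide's description of a feature that depends sensitively on individual edges.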
Mining Network - Network Feature Variance
● Maximise the sum of importance scores of individual features.
● Minimise the sum of similarity scores between features.
Mining Network - Feature Set Selection
A better outcome can be found if more than one feature is used.
The feature selection model from Geng et al. is used.
After generating and selecting feature sets, they are used to predict company value.
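The trade-off between high importance and low similarity can be approximated greedily, as in the sketch below. This is an illustrative heuristic and not necessarily the exact model from Geng et al.; the `penalty` parameter and the toy scores are assumptions.

```python
def greedy_feature_set(importance, similarity, k, penalty=1.0):
    """Pick k features, discounting those similar to already chosen ones.

    importance: {feature: score}
    similarity: {frozenset({feature_a, feature_b}): score}
    """
    scores = dict(importance)
    chosen = []
    while scores and len(chosen) < k:
        best = max(scores, key=scores.get)
        chosen.append(best)
        del scores[best]
        # Penalise the remaining features by their similarity to the pick.
        for name in scores:
            scores[name] -= penalty * similarity[frozenset((best, name))]
    return chosen


importance = {"a": 0.9, "b": 0.85, "c": 0.5}
similarity = {
    frozenset(("a", "b")): 0.8,  # a and b are nearly redundant
    frozenset(("a", "c")): 0.1,
    frozenset(("b", "c")): 0.2,
}
chosen = greedy_feature_set(importance, similarity, k=2)
```

Feature `b` scores well on its own but is skipped because it is nearly redundant with `a`, which is the behaviour the two objectives are meant to produce.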
Evaluation of prediction results
STEPS:
1. Predict company values (profits and revenues) for 20 of the ‘Fortune 500’ companies
2. Evaluate the effectiveness of network features and parameters on profits and revenues
Performance Evaluation measures
Squared Correlation Coefficient (r²) - Quantifies the correlation between the predicted and true valuations
Performance Evaluation measures
Mean Squared Error (MSE) - Error between the predicted valuations and the true valuations
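Both measures can be computed in a few lines. The sketch below takes r² to be the squared Pearson correlation between predicted and true values, which is the usual reading of "squared correlation coefficient"; the sample numbers are made up.

```python
def mse(y_true, y_pred):
    """Mean of the squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Squared Pearson correlation between true and predicted values."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov ** 2 / (var_t * var_p)


actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 2.5, 4.5]
error = mse(actual, predicted)
fit = r_squared(actual, predicted)
```

A higher r² and a lower MSE both indicate a better prediction.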
Step 1: Company value prediction - (1)
● Select 20 large companies from different industries from the ‘Fortune 500’ list
● Companies Selected: IBM, Intel, Microsoft, GM, HP, Honda, Nissan, AT&T, Walmart, Yahoo, Nike, Dell, Starbucks, JP Morgan, Pepsi, Cisco Systems, FedEx, The Gap, American Electric Power, Sun Microsystems.
● These made it to the list continuously for many years and have information regarding company valuation and network.
Step 1: Company value prediction - (2)
● Learn profit model for each 5 years’ networks and predict next year profits
● Then compare the actual profit earned in that year with predicted profit value
Step 1: Company value prediction - (3)
● 1995 was the only year in which the prediction was much lower than the real value
Step 1: Company value prediction - (4)
● Learn from 10 years of networks to predict the following year’s profit
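The learn-then-predict loop in Step 1 can be sketched as a rolling window. The slides do not say which learning model was used, so a one-feature least-squares line stands in for it here. The example also assumes `feature_by_year[y]` holds the network feature that is available when predicting the profit of year `y`.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx


def rolling_predictions(feature_by_year, profit_by_year, window=5):
    """Train on `window` consecutive years, then predict the next year."""
    years = sorted(profit_by_year)
    predictions = {}
    for i in range(window, len(years)):
        train = years[i - window:i]
        slope, intercept = fit_line(
            [feature_by_year[y] for y in train],
            [profit_by_year[y] for y in train],
        )
        target = years[i]
        predictions[target] = slope * feature_by_year[target] + intercept
    return predictions


# Toy data: profit is an exact linear function of the feature.
feature_by_year = {1990 + i: float(i) for i in range(7)}
profit_by_year = {1990 + i: 2.0 * i + 1.0 for i in range(7)}
predictions = rolling_predictions(feature_by_year, profit_by_year, window=5)
```

Passing `window=10` gives the 10-year variant; each prediction can then be compared with the actual profit for that year.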
Step 2: Effectiveness of network features - (1)
● Evaluate effectiveness of network features
● Use different feature sets for predicting companies’ mean profits over the years and take average to compare prediction performance by each feature set.
Step 2: Effectiveness of network features - (2)
● Feature set notations:
➔ s - Structural Features (network features)
➔ t - Temporal Features
➔ p - Financial Features
➔ d - Delta change in temporal features
● Combine notation:
➔ sp - network features + financial features
➔ stdp - combination of all features
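The combined notation can be read as plain concatenation of the per-set feature blocks, as in this small sketch. The values in `blocks` are placeholders, not figures from the article.

```python
def combine(blocks, notation):
    """Concatenate feature blocks in the order given by the notation string."""
    vector = []
    for key in notation:
        vector.extend(blocks[key])
    return vector


# Hypothetical feature blocks for one company and one year.
blocks = {
    "s": [2, 5.0],   # structural (network) features
    "t": [0.4],      # temporal features
    "d": [-0.1],     # delta change in temporal features
    "p": [12.3],     # financial features
}
sp = combine(blocks, "sp")
stdp = combine(blocks, "stdp")
```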
Step 2: Effectiveness of network features - (3)
● Using ‘p’ (financial profile) only has better performance than s, t, d individually
● Combined feature sets improve the profit prediction performance
● Using the ‘stdp’ (combination of all) feature set outperforms the network (‘s’) only and financial (‘p’) only feature sets by 150% and 34% respectively
Step 2: Effectiveness of network features - (4)
● Network features (‘s’, ‘t’, ‘d’) do not seem to contribute to revenue prediction
● In the graphs, any combination that includes the financial feature set ‘p’ shows a significant impact on the revenue prediction
Step 2: Effectiveness of network features - (5)
Learning outcomes of using network features:
● For profit prediction, the financial feature set contributes more than the network feature sets.
● A combination of all features further improves the profit prediction
● Network features do not seem to contribute to revenue prediction; only the financial feature set does.
Step 2: Effectiveness of parameters - (1)
● Tune parameters window size and delta size
● Compare companies’ networks that existed 1 and 3 years prior
● Take the average of r² over the different years
Step 2: Effectiveness of parameters - (2)
● Both window size and delta size showed similar results
● Better results when using networks from the previous year (window, delta = 1) rather than 3 years prior
Step 2: Effectiveness of parameters - (3)
Learning outcomes of using parameters:
● Either window or delta alone is adequate for profit prediction
● A window or delta of 1 previous year gives a more accurate prediction than networks from 3 years prior
● Networks display a 1-year lagged impact on changes in a company’s value
Article’s Research Questions - Learning Outcomes
1. Is it possible to predict a company’s value (such as revenue and profit) based on dynamic company networks?
Yes, we have seen this by using dynamic social networks of companies
2. How can we infer evolutionary company networks?
Developed an algorithm to infer longitudinal company networks
3. What features of a longitudinal network are useful for a company and how can they be generated?
Use of network features, financial features and combinations of these features for company value prediction. Deduced that network features contribute towards profit prediction but do not seem to help predict revenues.
Article 2: Network Science, Web Science and Internet Science
Definitions
Network Science - Understanding the emergence of networks, developing models to foresee how networks evolve and optimising networks.
Web Science - Study of large-scale socio-technical systems such as the WWW. It considers the relationship between people and technology, the ways in which they complement each other, and the impact this has on society.
Internet Science - A discipline that looks into the evolution of internet networks and society. The internet provides an infrastructure on which human activity is fast becoming heavily reliant.
Thanks for listening. Questions?