The role of News Analytics in financial engineering: a review and the road ahead
Gautam Mitra 7 December 2011 London
Outline
Introduction
  What… Why… How.
  A commercial
News data
  Data sources
  Information Contents/Metadata
  Summary Information/Views
  Information/modelling architecture
Models and Applications
  Abnormal Returns
  News Enhanced Trading Strategies
  Risk Control
Case studies
  Risk Control
  News Analytics Toolkit
  Momentum study
Summary
  Conclusion
WHAT: News analytics: a working definition
News analytics refers to the measurement of the various qualitative and quantitative attributes of textual news stories. Some of these attributes are: sentiment, relevance, and novelty. Expressing news stories as numbers permits the manipulation of …information in a mathematical and statistical way
< Taken from Wiki >
A news story is about an event
WHY: the research problem = the business problem
The world of financial analytics is concerned with three leading problems.
( i ) Pricing of assets in a temporal setting
( ii ) Making optimum investment decisions (low frequency) or optimum trading decisions (high frequency)
( iii ) Controlling risk at different time exposures
How: the message
Finance industry focuses on three major applications:
> High frequency :Trading strategies
> Low frequency :Investment strategies
> Risk control
By increasing the information set with quantified news the legacy models for the above applications can be enhanced
Knowledge from three disciplines is required
> Information engineering
> AI …Knowledge Engineering
> Financial Engineering
News
Market Environment
Sentiment
[Behavioural finance < greed..fear..irrational exuberance >………
Wall Street 1
Wall Street 2 => money never sleeps ]
Introduction
[ neo classical models for choice or decision making]
Trading Strategies/ Decisions
Investment Decisions
Risk Control Decisions
Introduction
R & D Challenge: Identify Killer Application
Smart investors rapidly analyse/digest information.
News stories/announcements.
Stock price moves (market reactions).
Act promptly to take trading/investment decisions.
Can a machine act intelligently (AI) to compete with or outsmart humans?
Introduction
Commercial: Read
The Handbook of News Analytics in Finance
By: Gautam Mitra and Leela Mitra
< for an instant understanding ...! >
< or look up http://www.bis.gov.uk/foresight/our-work/projects/current-projects/computer-trading
The Future of Computer Trading in Financial Markets
Our report: Automated analysis of news to compute market sentiment: its impact on liquidity and trading... Gautam Mitra, Dan diBartolomeo, Ashok Banerjee, Xiang Yu.
Which Asset classes....?
FX- Currency
Commodities
Fixed income (Bonds)
Stocks (Equities)
Wall Street proverb:
‘Stocks are stories, bonds are mathematics’
News data: Data sources
Exchange
ECN
Retail Brokers & Market Makers
Broker-Dealers & Market Makers
Retail Customers
Institutional Customers
Customers
News Data Feed Providers
Market Data Feed Providers
Tertiary Market Participants
Main Market Participants
Traders [ High Frequency ]
Fund Managers [ Low Frequency ]
Desktop
• Market Data
• NewsWire
• Web < blogs, twitter, message boards >
Data WareHouse
DataMart
News data: Data sources
Sources of news/informational flows (Leinweber)
News: mainstream media, reputable sources.
  Newswires to traders' desks.
  Newspapers, radio and TV.
Pre-News: source data.
  SEC reports and filings.
  Government agency reports.
  Scheduled announcements, macroeconomic news, industry stats, company earnings reports…
Web-based news and social media: blogs, websites and message boards.
  Quality can vary significantly.
  Barriers to entry are low.
  Human behaviour and agendas.
News data: Data sources
Financial news can be split between:
  Scheduled news (synchronous)
  Unscheduled news (asynchronous, event driven)
Scheduled news (synchronous):
  Arrives at pre-scheduled times
  Much of pre-news
  Structured format < XML..XBRL >
  Often basic numerical format
  Typically macroeconomic announcements and earnings announcements
News data: Data sources
Unscheduled news (asynchronous, event driven):
  Arrives unexpectedly over time
  Mainstream news and social media
  Unstructured, qualitative, textual form
  Non-numeric
  Difficult to process quickly and quantitatively
  May contain information about effect and cause of an event
  To be applied in quant models, it needs to be converted to an input time series
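The last point, converting event-driven news into an input time series, can be illustrated with a minimal Python sketch. It assumes each news item has already been reduced to a (timestamp, sentiment, relevance) tuple; the function name, the daily bucketing and the simple mean are illustrative assumptions, not a method taken from the slides.

```python
from collections import defaultdict
from datetime import datetime


def daily_sentiment_series(events, min_relevance=0.0):
    """Aggregate (timestamp, sentiment, relevance) events into a daily mean series.

    Hypothetical helper: the bucketing rule (calendar day) and the
    aggregation (plain mean) are assumptions made for illustration.
    """
    buckets = defaultdict(list)
    for ts, sentiment, relevance in events:
        if relevance >= min_relevance:
            buckets[ts.date()].append(sentiment)
    # Days without qualifying news are simply absent from the series.
    return {day: sum(v) / len(v) for day, v in sorted(buckets.items())}


# Made-up events: irregular arrival times, scores on a 0-100 scale.
events = [
    (datetime(2011, 12, 5, 9, 30), 60.0, 90.0),
    (datetime(2011, 12, 5, 14, 0), 40.0, 80.0),
    (datetime(2011, 12, 6, 11, 15), 70.0, 30.0),  # dropped: low relevance
    (datetime(2011, 12, 7, 8, 0), 20.0, 95.0),
]
series = daily_sentiment_series(events, min_relevance=75.0)
for day, value in series.items():
    print(day, value)
```

A production pipeline would also have to decide how to treat days with no news (for example, carrying forward a neutral score), which this sketch leaves open.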
Information contents/Metadata
Key attributes include:
Entity Recognition
Relevance
Novelty
Events categories
Sentiment
Pre-analysis extracts/computes/mines these attributes; using text analysis and AI classifiers, sentiment scores are created. This is the (news) metadata.
The news flow/intensity also influences the resulting sentiment.
Information/modelling architecture
Information value chain: Data… …information… knowledge (Data analysis, Data mart, quant models)
Inputs:
  Mainstream News
  Pre-News
  Web 2.0 / Social Media
  (Numeric) financial market data
Pre-Analysis (classifiers & others) produces metadata:
  • Entity Recognition
  • Relevance
  • Novelty
  • Events
  • Sentiment Score
  News Flow/Intensity
Analysis: Consolidated Data mart
Updated beliefs, ex-ante view of market environment
Quant Models:
  1. Return Predictions
  2. Fund Management / Trading Decisions
  3. Volatility estimates and risk control
Analysis .. synthesis .. mining: entity recognition
Identify entities such as companies in news stories using point-in-time sensitive information:
  Short names
  Long names
  Common abbreviations
  Common misspellings
  Securities identifiers
  Subsidiaries
Analysis .. synthesis .. mining: relevance
Calculate the relevance of a story to a given company:
• Mentions in the text
• Positioning in the story (headline vs. last paragraph)
• Total number of companies mentioned
• Detect roles played by companies in the story
• Represent the context numerically
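As a toy illustration of how the ingredients above might be turned into a number, the Python sketch below combines mention density, headline positioning and the number of companies in the story into a 0-100 score. The function, the weights (0.4 / 0.3 / 0.3) and the saturation rule are all assumptions made for this example; commercial relevance scores are proprietary and will differ.

```python
def relevance_score(mentions, total_words, in_headline, n_companies):
    """Toy relevance score in [0, 100] for one company in one story.

    All weights are illustrative assumptions, not a vendor formula.
    """
    if total_words <= 0 or n_companies <= 0:
        return 0.0
    # Mention density, saturating once mentions reach 5% of the words.
    mention_share = min(1.0, 20.0 * mentions / total_words)
    # Positioning: a headline mention counts more than a body mention.
    headline_bonus = 1.0 if in_headline else 0.0
    # Focus: a story about one company is more relevant than a round-up.
    focus = 1.0 / n_companies
    score = 100.0 * (0.4 * mention_share + 0.3 * headline_bonus + 0.3 * focus)
    return round(score, 2)


# A focused story with a headline mention vs. a passing mention in a round-up.
focused = relevance_score(mentions=5, total_words=100, in_headline=True, n_companies=1)
passing = relevance_score(mentions=1, total_words=200, in_headline=False, n_companies=4)
print(focused, passing)
```

Detecting the role a company plays in the story, also listed above, needs linguistic analysis and is not attempted here.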
Analysis .. synthesis .. mining: novelty
Is the news story "new" or novel?
• Elementize the various characteristics of a news story
• Distinguish between similar vs. duplicate stories
• Define a time window between stories
Example: Toyota’s Vehicle Recall (news flow in the first 30 minutes)
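A minimal Python sketch of the novelty idea: each story is compared with earlier stories inside a time window and labelled novel, similar or duplicate. Word-set Jaccard similarity, the 24-hour window and the two thresholds are assumptions chosen for illustration; the headlines and timestamps are made up and are not the actual Toyota news flow.

```python
from datetime import datetime, timedelta


def tokens(text):
    """Lower-cased set of whitespace-separated words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def label_novelty(stories, window=timedelta(hours=24), dup=0.9, similar=0.5):
    """Label (timestamp, headline) pairs as 'novel', 'similar' or 'duplicate'.

    Thresholds and window length are illustrative assumptions.
    """
    labels, seen = [], []
    for ts, headline in sorted(stories):
        toks = tokens(headline)
        # Best match among earlier stories that fall inside the time window.
        best = max(
            (jaccard(toks, old) for old_ts, old in seen if ts - old_ts <= window),
            default=0.0,
        )
        if best >= dup:
            labels.append("duplicate")
        elif best >= similar:
            labels.append("similar")
        else:
            labels.append("novel")
        seen.append((ts, toks))
    return labels


t0 = datetime(2011, 1, 3, 16, 0)  # arbitrary start time for the example
stories = [
    (t0, "toyota recalls vehicles over accelerator pedal"),
    (t0 + timedelta(minutes=5), "toyota recalls vehicles over accelerator pedal"),
    (t0 + timedelta(minutes=10),
     "toyota recalls 2.3 million vehicles over accelerator pedal problem"),
    (t0 + timedelta(days=3), "toyota recalls vehicles over accelerator pedal"),
]
print(label_novelty(stories))
```

The last story repeats the first headline but falls outside the window, so it is treated as novel again; that is the effect of defining a time window between stories.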
Analysis .. synthesis .. mining: event categories
Company news and events are categorized:
• Identify actionable events
• The more detailed the event, the better
• Differentiate between scheduled vs. unscheduled news events
• Distinguish between explanatory or predictive inputs
Analysis .. synthesis .. mining: sentiment
Summary information and views
Thomson Reuters News Analytics: equity coverage and available data
Equity coverage:
  All equities ............ 34,037 (100.0%)
  Active companies ........ 32,719 (96.1%)
  Inactive companies ....... 1,318 (3.9%)
Equity coverage by region:
  Americas ................ 14,785
  APAC .................... 11,055
  EMEA ..................... 8,197
Equity coverage updates: updated bi-weekly for recent changes (de-listings, M&A, IPOs).
History: available from January 2003 (history kept for delisted companies; symbology changes tracked).
RavenPack News Analytics
Equity coverage by region:
  All equities ............ 28,279 (100%)
  Americas ................ 11,950 (42.24%)
  Asia ..................... 8,858 (31.31%)
  Europe ................... 5,859 (20.71%)
  Oceania .................... 436 (5.08%)
  Africa ..................... 186 (0.66%)
For the most updated list of supported companies download the companies.csv file at: https://ravenpack.com/newsscores/
Historical data:
  Data format: comma-separated values (.csv) files
  Date/time info: in Coordinated Universal Time (UTC)
  Archive range: since Jan 1, 2005
  Archive packaging: monthly .csv files compressed in .zip files on a per-year basis
Summary information
Other suppliers
Deutsche Boerse < Alpha Flash >
Bloomberg ‘Black box newsfeed’
Dow Jones Elementized Newsfeed
Summary information and views
Tetlock et al. event study shows “information leakage”
[Figure: Average Stock Price Reaction to Negative News Events. Source: Macquarie Quant Research, May 2009]
[Figure: Average Stock Price Reaction to Positive News Events. Source: Macquarie Quant Research, May 2009]
[Figure: Illustration of Seasonality (Hafez, RavenPack)]
[Figure: RavenPack Sentiment Scores]
[Figure: Reuters NewsScope Sentiment Engine]
Model & Applications… (abnormal) Returns
Traders and quant managers … identify and exploit asset mispricings before they correct … generate alpha
News data can be used for:
  Stock picking and generating trading signals
  Factor models
  Exploiting behavioural biases in investor decisions
Model & Applications… (abnormal) Returns
Stock picking and generating trading signals
Sentiment reversal as buy signal: J Kitterell uses a sequence of P, N scores as a means of testing sentiment reversal.
Momentum strategy enhanced by news sentiment scores: Macquarie research; Sinha also reports results with Thomson Reuters data.
Model & Applications… (abnormal) Returns
Behavioural biases
Odean and Barber (2007) find evidence that individual investors have a tendency to buy attention-grabbing stocks.
Professional investors are better equipped to assess a wider range of stocks, so they are less prone to buying attention-grabbing stocks.
Da, Engelberg and Gao also consider how the amount of attention a stock receives affects its cross-section of returns.
  They use the frequency of Google searches for a particular company as a measure of attention.
  They find some evidence that changes in investor attention can predict the cross-section of returns.
Model & Applications… (abnormal) Returns
Stock picking and generating trading signals
Li (2006): simple ranking procedure … identify stocks with positive and negative sentiment
  10-K SEC filings for non-financial firms, 1994 – 2005
  Risk sentiment measure: count the number of times the words risk, risks, risky, uncertain, uncertainty and uncertainties appear in the management discussion and analysis section
  Strategy: long in low risk sentiment stocks, short in high risk sentiment stocks … reasonable level of returns
Leinweber (2010): event studies based on Reuters NewsScope Sentiment Engine
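The word-count measure attributed to Li (2006) above is simple enough to sketch in Python. This sketch uses the raw count of the six listed words and a plain ranking; any normalisation or differencing used in the original study is not stated on the slide and is not reproduced. The tickers and text snippets are invented.

```python
import re

# The six words listed on the slide.
RISK_WORDS = {"risk", "risks", "risky", "uncertain", "uncertainty", "uncertainties"}


def risk_sentiment(text):
    """Count occurrences of the risk words in a block of text."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(1 for w in words if w in RISK_WORDS)


def long_short(scores, n):
    """Return (long, short) lists: the n lowest and n highest risk-sentiment names."""
    ranked = sorted(scores, key=scores.get)
    return ranked[:n], ranked[-n:]


# Hypothetical management discussion and analysis snippets.
mdna = {
    "AAA": "Demand is stable.",
    "BBB": "Risks remain; the outlook is uncertain and risky.",
    "CCC": "There is some uncertainty.",
}
scores = {ticker: risk_sentiment(text) for ticker, text in mdna.items()}
longs, shorts = long_short(scores, n=1)
print(scores, longs, shorts)
```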
News Enhanced Algorithmic Trading
1. Information/modelling architecture
2. Modelling architecture: Pre-trade – Post-trade Analysis
Characterize asset behaviour/dynamics by
i. Asset Price/Return
ii. Asset (Price) Volatility
iii. Asset (Price) Liquidity
Construct trading models using these measures
[Diagram: news enhanced algorithmic trading architecture]
Asset measures: Price/Returns, Volatility, Liquidity
Inputs (feeds):
  Market Data: bid, ask, execution price, time bucket
  News Meta Data: time stamp, company-ID, relevance, novelty, sentiment score, event category…
Predictive Analysis Model (predictive analytics): (analytic) market data: price, volatility, liquidity
Pre-Trade Analysis (ex-ante decision model)
Automated Algo-Strategies (low latency execution algorithms): trade orders
Post-Trade Analysis (ex-post analysis model): report
Applications: Risk management
Traditionally, historic asset price data has been used to estimate risk measures.
  These ex post, retrospective measures fail to account for developments in the market environment, investor sentiment and knowledge.
When there are significant changes in the market environment, traditional measures can fail to capture the true level of risk (Mitra, Mitra and diBartolomeo 2009; diBartolomeo and Warrick 2005).
Incorporating measures or observations of the market environment in risk estimation is important.
EQUITY PORTFOLIO RISK (VOLATILITY) ESTIMATION USING MARKET INFORMATION AND SENTIMENT
Leela Mitra
Co-authors: Gautam Mitra and Dan diBartolomeo.
Sponsored by:
Case study: Outline
Problem setting
Model description
Updating the model using quantified news
Study I
Study II
Discussion and conclusions
Introduction & background
Tetlock et al. (2007) note there are three main sources of information:
  Analyst forecasts
  Publicly disclosed accounting variables
  Linguistic descriptions of operating environments
If the first two are incomplete, the third may give us relevant information
Tetlock et al. (2007) introduce “news” to a fundamental factor model
Problem setting
Three main types of factor models:
  Macroeconomic: use economic variables as factors (Chen, Ross and Roll; Sharpe)
  Fundamental: based on firm specific (cross-sectional) attributes (BARRA and Fama-French)
  Statistical: factors are unobservable and derived via calibration, often orthogonal.
They differ on sources of risk (uncertainty); can be shown to be rotations of each other.
Problem setting
Need for models to update risk structure as environment changes
diBartolomeo and Warrick (2005) update covariance estimates using option implied volatility
Traders respond quickly in an intelligent fashion
CHANGES TO MARKET ENVIRONMENT => TRADERS REACT => CHANGES IN OPTION IMPLIED VOLATILITY => CHANGES IN ASSET COVARIANCE MATRIX
Model description
An extension of diBartolomeo & Warrick (2005)
In two parts:
  “Basic” statistical factor model
  Factor variance estimates are updated for changes in option implied volatility
Model description
We construct a statistical factor model using principal component analysis to find orthogonal factors
Update the asset variances using option implied volatility data
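A minimal NumPy sketch of the first part, a statistical factor model built by principal component analysis, is given below. It is an illustration under stated assumptions rather than the authors' implementation: the function name, the use of the sample covariance matrix, the choice of k = 2 factors and the simulated returns are all hypothetical.

```python
import numpy as np


def pca_factor_model(returns, k):
    """Fit a k-factor statistical model to a T x N return matrix via PCA.

    Returns loadings (N x k), factor variances (k,) and asset specific
    variances (N,). Illustrative sketch only.
    """
    cov = np.cov(returns, rowvar=False)          # N x N sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]        # indices of the k largest
    loadings = eigvecs[:, order]                 # orthonormal columns
    factor_var = eigvals[order]                  # variance of each factor
    common = (loadings ** 2) @ factor_var        # variance explained per asset
    specific_var = np.clip(np.diag(cov) - common, 0.0, None)
    return loadings, factor_var, specific_var


rng = np.random.default_rng(0)
returns = rng.normal(size=(250, 5)) * 0.01       # simulated daily returns
B, fvar, svar = pca_factor_model(returns, k=2)

# With orthogonal factors, each asset variance splits into a common part
# and a specific part.
asset_var = (B ** 2) @ fvar + svar
print(np.sqrt(asset_var))
```

Because the factors are orthogonal, asset variances decompose additively, which is what makes it possible to map updated asset variances back to factor and specific variances in the second part of the model.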
Model description
For each asset for which we have option implied volatility data:
  We wish to identify the new factor variances and asset specific variances implied by updated asset variances
  Solve this set of simultaneous equations to derive the values, subject to some further conditions
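The simultaneous equations referred to above can be written out under the orthogonal-factor assumption. The notation is introduced here for illustration and is not taken from the slides: $\beta_{ik}$ is the loading of asset $i$ on factor $k$, $\sigma_{F_k}^2$ a factor variance, $\sigma_{\varepsilon_i}^2$ an asset specific variance, $\sigma_i^{\mathrm{IV}}$ the option implied volatility of asset $i$, tildes mark updated values, and $\mathcal{O}$ is the set of assets with option data.

```latex
% Variance decomposition in the "basic" model (orthogonal factors):
\sigma_i^2 = \sum_{k=1}^{K} \beta_{ik}^2 \, \sigma_{F_k}^2 + \sigma_{\varepsilon_i}^2

% One equation per asset with option implied volatility data:
\left(\sigma_i^{\mathrm{IV}}\right)^2
  = \sum_{k=1}^{K} \beta_{ik}^2 \, \tilde{\sigma}_{F_k}^2 + \tilde{\sigma}_{\varepsilon_i}^2 ,
  \qquad i \in \mathcal{O}
```

There is one equation per asset in $\mathcal{O}$ but $K$ factor variances plus one specific variance per asset to determine, so the system is underdetermined on its own; this is presumably why further conditions are imposed.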
Model description
Further conditions:
  Allow for structure that is expected of principal component factors
  Assume factor variances do not decline substantially from one period to the next
  Similarly assume asset specific variances do not decline substantially from one period to the next
Study I
Period: 17 January 2008 to 23 January 2008
EURO STOXX 50
Market sentiment worsened
Option implied volatility measures surged
A few key events:
  Large interest rate cut
  George Bush announced stimulus plan
  Soc Gen hit by Jerome Kerviel rogue trader scandal
Study I
Portfolio volatility from the option implied model
  is higher than in the “basic” model
  rises significantly on 21 January
Study II
Over 2008 markets fell
  Loss of liquidity in credit markets and banking system
  Many banks suffered bankruptcy or were propped up
September and October 2008: volatility for financial firms particularly high
  Lehman bankruptcy
  Lloyds takeover of HBOS
  Restrictions on short selling of financials
Study II
18 September 2008 to 24 September 2008
Dow Jones 30
Portfolio of three finance stocks: Bank of America, CitiGroup and JP Morgan Chase; equal weight on each stock
Portfolio of three non-finance stocks: Johnson & Johnson, Kraft Foods and Coca Cola; equal weight on each stock
Can the model predict impact in one sector…?
News Analytics Toolkit
Momentum Study
RSI (Relative Strength Indicator) with a 15 day timeframe
U = close_now − close_previous if up period, 0 otherwise
D = close_previous − close_now if down period, 0 otherwise
RS = EMA(U,n) / EMA(D,n)
EMA = n-period Exponential Moving Average
RSI = 100 − 100 / (1 + RS)
Asset Universe: FTSE100 and CAC40
Daily market data from Jan 2005 to Jan 2011
Portfolio Selection:
  Ranked by the RSI Momentum Indicator
  Long only, equally weighted
  Calendar rebalancing frequency every 60 or 90 working days
  Transaction Cost: 0.2%
  Number of assets in portfolio: 10 for FTSE100, 5 for CAC40
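The RSI definition on this slide can be sketched in a few lines of Python. The slide does not say which EMA convention is used; this sketch assumes the common smoothing factor alpha = 2 / (n + 1) seeded with the first observation, and it returns 100 when there are no down moves. Both choices are assumptions.

```python
def ema(values, n):
    """Final value of an n-period exponential moving average.

    Assumes alpha = 2 / (n + 1) and seeds with the first observation.
    """
    alpha = 2.0 / (n + 1)
    out = values[0]
    for v in values[1:]:
        out = alpha * v + (1 - alpha) * out
    return out


def rsi(closes, n=15):
    """RSI from a list of at least two closing prices, following the slide."""
    ups, downs = [], []
    for prev, now in zip(closes, closes[1:]):
        change = now - prev
        ups.append(max(change, 0.0))      # U: gain in an up period, else 0
        downs.append(max(-change, 0.0))   # D: loss in a down period, else 0
    avg_up, avg_down = ema(ups, n), ema(downs, n)
    if avg_down == 0:
        return 100.0                      # RS unbounded: treat RSI as 100
    rs = avg_up / avg_down
    return 100.0 - 100.0 / (1.0 + rs)


print(rsi([10.0, 11.0, 10.0]))
```

Other EMA conventions (for example Wilder's alpha = 1/n) are also in common use and would give different values for the same prices.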
Momentum Study
News enhanced Momentum Strategy
News provided by RavenPack News Score 1.5
Revised Ranking including Market Data and News Data
  Companies are ranked according to average sentiment score
  Only news with Relevance ≥ 75 and within the previous 15 days are considered
  Momentum ranking and news ranking are combined with equal weights between news sentiment score and RSI score
  Companies with no news in the period are considered to have an average sentiment score of 50 (neutral sentiment)
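The ranking rules above can be sketched as follows. This is one possible reading, with assumptions flagged: the two scores are averaged directly (both are on a 0-100 scale) rather than converted to ranks first, "previous 15 days" is taken as calendar days, and the function names, tickers and data are hypothetical.

```python
from datetime import date, timedelta


def avg_sentiment(news, today, min_relevance=75, lookback_days=15, neutral=50.0):
    """Average sentiment of qualifying (date, relevance, sentiment) items."""
    start = today - timedelta(days=lookback_days)
    scores = [s for d, rel, s in news if rel >= min_relevance and start <= d <= today]
    # No qualifying news: neutral score of 50, as stated on the slide.
    return sum(scores) / len(scores) if scores else neutral


def select_portfolio(rsi_by_ticker, news_by_ticker, today, n_assets):
    """Rank by the equal-weight blend of RSI and sentiment; long only, equal weights."""
    combined = {}
    for ticker, rsi_value in rsi_by_ticker.items():
        sentiment = avg_sentiment(news_by_ticker.get(ticker, []), today)
        combined[ticker] = 0.5 * rsi_value + 0.5 * sentiment
    ranked = sorted(combined, key=combined.get, reverse=True)
    chosen = ranked[:n_assets]
    return {ticker: 1.0 / len(chosen) for ticker in chosen}


today = date(2010, 6, 30)
rsi_scores = {"AAA": 70.0, "BBB": 60.0, "CCC": 65.0}
news = {
    "AAA": [(date(2010, 6, 25), 90, 20.0)],             # recent, relevant, negative
    "BBB": [(date(2010, 6, 28), 80, 90.0),              # recent, relevant, positive
            (date(2010, 5, 1), 99, 10.0)],              # too old: ignored
}                                                       # "CCC" has no news
weights = select_portfolio(rsi_scores, news, today, n_assets=2)
print(weights)
```

In this made-up example "AAA" has the highest RSI but drops out once its negative news is blended in, which is the intended effect of the news enhancement.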
Momentum Study
[Figure: FTSE 100, 90 days rebalancing]
[Figure: CAC 40, 90 days rebalancing]
[Figure: FTSE 100, 60 days rebalancing]
[Figure: CAC 40, 60 days rebalancing]
Summary & discussions
Applications of (semi-)automated news analytics in finance are growing in importance.
Pay back can be substantial to:
  Investment Managers
  Traders
  Internal Risk Auditors
  Regulators
Knowledge and skills from three different disciplines are required in various degrees to progress the field/make substantial impact:
  Information Systems.
  Artificial Intelligence.
  Financial Engineering & quantitative modelling (including behavioural finance).
Thank you for your attention
Comments and Questions please