Predicting the Stock Market with 250,000,000...

1
Predicting the Stock Market with 250,000,000 Tweets Mike del Balso, Alex Litoiu Design CONCLUSIONS REFERENCES ACKNOWLEDGEMENTS Introduc>on TESTING AND VERIFICATION 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 Linear Regression 1 day before Tweets day of Tweets 1 day aBer Tweets 2 days aBer Tweets 3 days aBer Tweets 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 KNearest Neighbour Predic>on 1 day before Tweets day of Tweets 1 day aBer Tweets 2 days aBer Tweets 3 days aBer Tweets 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 Logis>c Regression Predic>on 1 day before Tweets day of Tweets 1 day aBer Tweets 2 days aBer Tweets 3 days aBer Tweets 0 500000 1000000 1500000 2000000 2500000 3000000 Date 20111024 20111027 20111030 20111102 20111105 20111108 20111111 20111114 20111117 20111120 20111123 20111126 20111129 20111202 20111205 20111208 20111211 20111214 20111217 20111220 20111223 20111226 20111229 20120101 20120104 20120107 Total Tweets Collected Per Day Professor Brendan Frey for his guidance with machine learning techniques and allowing us access to his lab computer throughout the duraLon of the term. Professor Hans Kunov for giving us great guidance with respect to project management and always pushing us in the right direcLon when we needed it. Dan Astoorian for helping us with special permissions on the ECF network and always making Lme to accommodate us and to troubleshoot. 1. U.S. Census Bureau. (2001). Sta$s$cal abstract of the United States: 2001. U.S. Department of Commerce. Washington D.C.: United States Government. 2. 10gen, Inc. (n.d.). mongoDB. Retrieved 02 14, 2012, from h[p://www.mongodb.org/ 3. Aite Group. (2011). Algorithmic Trading in FX: Ready for Takeoff? New York: Aite Group. 4. Facebook. (n.d.). Sta$s$cs of Facebook. Retrieved Octovwe 30, 2011, from Facebook: h[ps://www.facebook.com/press/info.php?staLsLcs 5. Gimpert, B. (2011, May 13). Sour Grapes: Seven Reasons Why “That” TwiMer Predic$on Model is Cooked. Retrieved February 14, 2012, from Some Ben?: h[p:// blog.someben.com/2011/05/sourgrapessevenreasonswhythattwi[erpredicLon modeliscooked/ 6. Johan Bollen, H. M.J. (2010). Twi[er mood predicts the stock market. Compu$ng Research Repository , abs/1010.3003. 7. Joliffe, I. (1986). Principle Component Analysis (2 ed.). New York: Springer. 8. Nielsen, F. Å. (2011, March). A new ANEW: EvaluaLon of a word list for senLment analysis in microblogs. Compu$ng Research Repository . 9. Maurice Lorr, P. D. (n.d.). Profile of Mood States. Retrieved February 14, 2012, from MULTIHEALTH SYSTEMS INC: h[p://www.mhs.com/product.aspx? gr=cli&id=overview&prod=poms#descripLon 10. Petzoldt, D. (2011, February 11). Sta$s$cal flaws in “TwiMer mood predicts the stock market” research paper. Retrieved February 14, 2012, from h[p://petzoldt.tumblr.com/ post/3236488086/staLsLcalflawsintwi[ermoodpredictsthestock 11. Rosenberg, D. (2010, April 20). What's (technically) in your tweets? Retrieved February 14, 2012, from CNet News: h[p://news.cnet.com/830113846_32000292462.html 12. Twi[er. (n.d.). Streaming API Methods. Retrieved 02 14, 2012, from h[ps:// dev.twi[er.com/docs/streamingapi/concepts 13. Twi[er: @twi[ereng. (2011, June 30). 200 million Tweets per day. Retrieved February 14, 2012, from Twi[er Blog: h[p://blog.twi[er.com/2011/06/200milliontweetsper day.html In the year 2000, the value of all trades in US financial markets exceeded $500 Trillion (U.S. Census Bureau, 2001), over fiBy Lmes the GDP of the United States for that year. A third of all stock trades were driven by algorithms . The Stock Market is greatly influenced by investor confidence and human emoLon. This project analyzed 1% of the 250,000,000 million daily tweets for the past 100 days for human emoLon in order to predict the stock market performance of various financial symbols. 0 50 100 150 200 250 300 350 Algorithmic Trading Manual Trading Trillion $USD Year 2000 US Financial Market Figure 1: Year 2000 US Financial Markets Novelty: Our novel contribuLons are to experiment with idenLfying common pockets of senLment using Kmeans clustering machine learning algorithms and drawing correlaLons to varied financial stocks and futures, as opposed to just a single index. There are few publicaLons that accurately predict the stock market based on twi[er. Those that claim to do so use quesLonable methods. Func>onal Requirement: Predict whether a stock, index, futures contract, exchange traded fund (ETF), fixedincome security, indicator, or mutual fund will go up or down in a future Lmeframe (provide an uncertainty along with this predicLon) Objec>ve: The lower the uncertainty, the be[er Collec>on and Processing Correla>ons 1 2 3 4 5 6 7 1. Data Collec>on – Download 2.5 million tweets per day 2. Sen>ment Analysis Analyzes each tweet for 13 dimensions of senLment: Tension, Depression, Anger, FaLgue, Vigor, and Confusion, etc. Assigns each dimension a numeric score 3. Dimensionality Reduc>on extracts important and someLmes hidden signals of senLment from the senLment data 4. Storage – MongoDB database stores 450GB of Tweets. Holds the raw tweets, senLment scores, clustering results, financial data, and all other intermediate results. 6. Model Learning – The predicLve models are learned that are used by the predictor to create the stock price predicLons. Models are learned based on training data. TesLng data is then used by the predicLon module. 7. Predic>on Module uses the derived Twi[er data to make predicLons about the stock price on the test data, the second parLLon of data menLoned in model learning module secLon. Uses linear and logisLc regression and Knearest neighbors predicLon algorithm 1. Data Collec>on a. Twi_er: Tweets manually verified for correctness; graph of tweets collected per day b. Finance: Manually checked for correctness 2. Sen>ment Analysis: Tweets analyzed in all dimensions by humans, and compared to machine values a. Sen>ment v1: Tested manually, and determined to be noisy. Process repeated as SenLment v2. b. Sen>ment v2: Compared human raLngs to predicted raLngs. 3. Dimensionality Reduc>on: Checked for convergence, and visualized results a. Clustering b. Grouping RESULTS 6 Model Learning and 7 PredicLon Module results in the Results SecLon Figure 2: Project Design Diagram Figure 3: Module 1a – Total Tweets Per Day 0 0.5 1 1.5 2 2.5 3 0 1 2 3 4 5 6 7 8 Average cluster shia Itera>on Clustering Step Shia 2 1 0 1 2 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Machineperceived emo>onal value Humanperceived emo>onal value Posi>vity Figure 4: Module 2b – PosiLvity dimension in senLment analysis Figure 6: Module 3b – Number of tweets in each group on Jan 30, 2012 All Modules were tested for correctness: Sen>ment Version 1 Sen>ment Version 2 Tested 11 financial symbols Training Data: October 22 nd 2011 to December 26 th 2011 (65 days). TesLng data: December 27 th to January 10 th 2012 (15 days). Twi[er SenLment v1 created noisy senLment dimensions. When trying to predict a stock’s performance based on the number of tweets in each group, the average predicLve rates were 41% Figure 7: Linear Regression PredicLve Rates Figure 5: Module 3a – Clustering average shiB distance Figure 8: KNearest Neighbor PredicLve Rates Figure 9: LogisLc Regression PredicLon Rates Tested 2747 financial symbols Training Data: October 22 nd 2011 to December 26 th 2011 (65 days). TesLng data: December 27 th to January 10 th 2012 (15 days). When trying to predict a stock’s performance based on the number of tweets in each group, the average predicLve rates were 54%. Across all stocks, Knearest Neighbor achieved 56% predicLon rate across all offsets, and 57.2% predicLon 1 day in advance Logis>c KNearest Linear 0 days 0.556 0.576 0.505 1 day 0.547 0.572 0.567 2 days 0.544 0.562 0.492 3 days 0.532 0.537 0.492 Average 0.544 0.561 0.502 Figure 10: PredicLve rates of all methods 0,1,2 or 3 days in advance Figure 11: Frequency of predicLve rates for KNearest Neighbor, 1 day in advance Figure 12: Figure of index fund performance over the date ranges tested Collected 250 million tweets and followed 2747 stock symbols for 6 months Analyzed 3 months of data for correlaLons. Using KNearest Neighbor, achieved average 56% predicLon rate across all 2747 stocks Next Steps Confirm results on a larger data set Training data has different characterisLcs from tesLng data. We can likely achieve be[er results by extending the length of the experiment All twi[er users are not created equal. Filtering tweets by influence of user is likely to improve results Filtering tweets by subject ma[er is likely to improve predicLon rate

Transcript of Predicting the Stock Market with 250,000,000...

Page 1: Predicting the Stock Market with 250,000,000 Tweetsalitoiu.com/static/Personal_Projects/Twitter_Stock_Market/Poster.pdf · Predicting the Stock Market with 250,000,000 Tweets Mike%del%Balso,%Alex%Litoiu%

Predicting the Stock Market with 250,000,000 Tweets Mike  del  Balso,  Alex  Litoiu  

Design   CONCLUSIONS  

REFERENCES  

ACKNOWLEDGEMENTS  

Introduc>on  

TESTING  AND  VERIFICATION  

0  

0.1  

0.2  

0.3  

0.4  

0.5  

0.6  

0.7  

0.8  

Linear  Regression  

1  day  before  Tweets  

day  of  Tweets  

1  day  aBer  Tweets  

2  days  aBer  Tweets  

3  days  aBer  Tweets  

0  

0.1  

0.2  

0.3  

0.4  

0.5  

0.6  

0.7  

K-­‐Nearest  Neighbour  Predic>on  

1  day  before  Tweets  

day  of  Tweets  

1  day  aBer  Tweets  

2  days  aBer  Tweets  

3  days  aBer  Tweets  

0  

0.1  

0.2  

0.3  

0.4  

0.5  

0.6  

0.7  

Logis>c  Regression  Predic>on  

1  day  before  Tweets  

day  of  Tweets  

1  day  aBer  Tweets  

2  days  aBer  Tweets  

3  days  aBer  Tweets  

0  500000  1000000  1500000  2000000  2500000  3000000  

Date  

20111024  

20111027  

20111030  

20111102  

20111105  

20111108  

20111111  

20111114  

20111117  

20111120  

20111123  

20111126  

20111129  

20111202  

20111205  

20111208  

20111211  

20111214  

20111217  

20111220  

20111223  

20111226  

20111229  

20120101  

20120104  

20120107  

Total  Tweets  Collected  Per  Day  

•  Professor  Brendan  Frey  for  his  guidance  with  machine  learning  techniques  and  allowing  us  access  to  his  lab  computer  throughout  the  duraLon  of  the  term.  

 •  Professor  Hans  Kunov  for  giving  us  great  guidance  with  respect  to  project  

management  and  always  pushing  us  in  the  right  direcLon  when  we  needed  it.    •  Dan  Astoorian  for  helping  us  with  special  permissions  on  the  ECF  network  and  

always  making  Lme  to  accommodate  us  and  to  troubleshoot.  

 

1.  U.S.  Census  Bureau.  (2001).  Sta$s$cal  abstract  of  the  United  States:  2001.  U.S.  Department  of  Commerce.  Washington  D.C.:  United  States  Government.  

2.  10gen,  Inc.  (n.d.).  mongoDB.  Retrieved  02  14,  2012,  from  h[p://www.mongodb.org/  3.  Aite  Group.  (2011).  Algorithmic  Trading  in  FX:  Ready  for  Takeoff?  New  York:  Aite  Group.  4.  Facebook.  (n.d.).  Sta$s$cs  of  Facebook.  Retrieved  Octovwe  30,  2011,  from  Facebook:  

h[ps://www.facebook.com/press/info.php?staLsLcs  5.  Gimpert,  B.  (2011,  May  13).  Sour  Grapes:  Seven  Reasons  Why  “That”  TwiMer  Predic$on  

Model  is  Cooked.  Retrieved  February  14,  2012,  from  Some  Ben?:  h[p://blog.someben.com/2011/05/sour-­‐grapes-­‐seven-­‐reasons-­‐why-­‐that-­‐twi[er-­‐predicLon-­‐model-­‐is-­‐cooked/  

6.  Johan  Bollen,  H.  M.-­‐J.  (2010).  Twi[er  mood  predicts  the  stock  market.  Compu$ng  Research  Repository  ,  abs/1010.3003.  

7.  Joliffe,  I.  (1986).  Principle  Component  Analysis  (2  ed.).  New  York:  Springer.  8.  Nielsen,  F.  Å.  (2011,  March).  A  new  ANEW:  EvaluaLon  of  a  word  list  for  senLment  

analysis  in  microblogs.  Compu$ng  Research  Repository  .  9.  Maurice  Lorr,  P.  D.  (n.d.).  Profile  of  Mood  States.  Retrieved  February  14,  2012,  from  

MULTI-­‐HEALTH  SYSTEMS  INC:  h[p://www.mhs.com/product.aspx?gr=cli&id=overview&prod=poms#descripLon  

10.  Petzoldt,  D.  (2011,  February  11).  Sta$s$cal  flaws  in  “TwiMer  mood  predicts  the  stock  market”  research  paper.  Retrieved  February  14,  2012,  from  h[p://petzoldt.tumblr.com/post/3236488086/staLsLcal-­‐flaws-­‐in-­‐twi[er-­‐mood-­‐predicts-­‐the-­‐stock  

11.  Rosenberg,  D.  (2010,  April  20).  What's  (technically)  in  your  tweets?  Retrieved  February  14,  2012,  from  CNet  News:  h[p://news.cnet.com/8301-­‐13846_3-­‐20002924-­‐62.html  

12.  Twi[er.  (n.d.).  Streaming  API  Methods.  Retrieved  02  14,  2012,  from  h[ps://dev.twi[er.com/docs/streaming-­‐api/concepts  

13.  Twi[er:  @twi[ereng.  (2011,  June  30).  200  million  Tweets  per  day.  Retrieved  February  14,  2012,  from  Twi[er  Blog:  h[p://blog.twi[er.com/2011/06/200-­‐million-­‐tweets-­‐per-­‐day.html  

In  the  year  2000,  the  value  of  all  trades  in  US  financial  markets  exceeded  $500  Trillion  (U.S.  Census  Bureau,  2001),  over  fiBy  Lmes  the  GDP  of  the  United  States  for  that  year.  A  third  of  all  stock  trades  were  driven  by  algorithms  .    

The  Stock  Market  is  greatly  influenced  by  investor  confidence  and  human  emoLon.  This  project  analyzed  1%  of  the  250,000,000  million  daily  tweets  for  the  past  100  days  for  human  emoLon  in  order  to  predict  the  stock  market  performance  of  various  financial  symbols.  

0  50  100  150  200  250  300  350  

Algorithmic  Trading   Manual  Trading  

Trillion  $U

SD  

Year  2000  US  Financial  Market  

Figure  1:  Year  2000  US  Financial  Markets  

Novelty:  Our  novel  contribuLons  are  to  experiment  with  idenLfying  common  pockets  of  senLment  using  K-­‐means  clustering  machine  learning  algorithms  and  drawing  correlaLons  to  varied  financial  stocks  and  futures,  as  opposed  to  just  a  single  index.  There  are  few  publicaLons  that  accurately  predict  the  stock  market  based  on  twi[er.  Those  that  claim  to  do  so  use  quesLonable  methods.  

Func>onal  Requirement:  Predict  whether  a  stock,  index,  futures  contract,  exchange  traded  fund  (ETF),  fixed-­‐income  security,  indicator,  or  mutual  fund  will  go  up  or  down  in  a  future  Lme-­‐frame  (provide  an  uncertainty  along  with  this  predicLon)    Objec>ve:  The  lower  the  uncertainty,  the  be[er  

Collec>on  and  Processing   Correla>ons  

1  

2  

3  

4  

5  

6  

7  

1.   Data  Collec>on  –  Download  2.5  million  tweets  per  day      

 2.   Sen>ment  Analysis  –  

Analyzes  each  tweet  for  13  dimensions  of  senLment:    Tension,  Depression,  Anger,  FaLgue,  Vigor,  and  Confusion,  etc.  Assigns  each  dimension  a  numeric  score    

3.   Dimensionality  Reduc>on  -­‐  extracts  important  and  someLmes  hidden  signals  of  senLment  from  the  senLment  data  

4.   Storage  –    MongoDB  database  stores  450GB  of  Tweets.  Holds  the  raw  tweets,  senLment  scores,  clustering  results,  financial  data,  and  all  other  intermediate  results.    

6.   Model  Learning  –    The  predicLve  models  are  learned  that  are  used  by  the  predictor  to  create  the  stock  price  predicLons.  Models  are  learned  based  on  training  data.  TesLng  data  is  then  used  by  the  predicLon  module.    

7.   Predic>on  Module  -­‐  uses  the  derived  Twi[er  data  to  make  predicLons  about  the  stock  price  on  the  test  data,  the  second  parLLon  of  data  menLoned  in  model  learning  module  secLon.  Uses  linear  and  logisLc  regression  and  K-­‐nearest  neighbors  predicLon  algorithm    

1.   Data  Collec>on  a.   Twi_er:  Tweets  

manually  verified  for  correctness;  graph  of  tweets  collected  per  day  

b.   Finance:  Manually  checked  for  correctness  

2.   Sen>ment  Analysis:  Tweets  analyzed  in  all  dimensions  by  humans,  and  compared  to  machine  values  a.   Sen>ment  v1:  Tested  

manually,  and  determined  to  be  noisy.  Process  repeated  as  SenLment  v2.  

b.   Sen>ment  v2:  Compared  human  raLngs  to  predicted  raLngs.  

3.   Dimensionality  Reduc>on:  Checked  for  convergence,  and  visualized  results  a.   Clustering  b.   Grouping  

RESULTS  

6  Model  Learning  and  7  PredicLon  Module  results  in  the  Results  SecLon  

Figure  2:  Project  Design  Diagram  

Figure  3:  Module  1a  –  Total  Tweets  Per  Day  

0  

0.5  

1  

1.5  

2  

2.5  

3  

0   1   2   3   4   5   6   7   8  

Average  cluster  shia    

Itera>on  

Clustering  Step  Shia  

-­‐2  

-­‐1  

0  

1  

2  

0   0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9   1  

Machine

-­‐perceived

 emo>

onal  

value  

Human-­‐perceived  emo>onal  value    

Posi>vity  

Figure  4:  Module  2b  –  PosiLvity  dimension  in  senLment  analysis  

Figure  6:  Module  3b  –  Number  of  tweets  in  each  group  on  Jan  30,  2012  

All  Modules  were  tested  for  correctness:  

Sen>ment  Version  1   Sen>ment  Version  2  •  Tested  11  financial  symbols  •  Training  Data:  October  22nd  2011  to  

December  26th  2011  (65  days).    •  TesLng  data:  December  27th  to  January  10th  

2012  (15  days).    •  Twi[er  SenLment  v1  created  noisy  

senLment  dimensions.    •  When  trying  to  predict  a  stock’s  

performance  based  on  the  number  of  tweets  in  each  group,  the  average  predicLve  rates  were  41%  

Figure  7:  Linear  Regression  PredicLve  Rates  

Figure  5:  Module  3a  –  Clustering  average  shiB  distance  

Figure  8:  K-­‐Nearest  Neighbor  PredicLve  Rates  

Figure  9:  LogisLc  Regression  PredicLon  Rates  

•  Tested  2747  financial  symbols  •  Training  Data:  October  22nd  2011  to  

December  26th  2011  (65  days).    •  TesLng  data:  December  27th  to  January  10th  

2012  (15  days).    •  When  trying  to  predict  a  stock’s  

performance  based  on  the  number  of  tweets  in  each  group,  the  average  predicLve  rates  were  54%.  

•  Across  all  stocks,  K-­‐nearest  Neighbor  achieved  56%  predicLon  rate  across  all  offsets,  and  57.2%  predicLon  1  day  in  advance  

Logis>c   K-­‐Nearest     Linear  0  days   0.556   0.576   0.505  1  day   0.547   0.572   0.567  2  days   0.544   0.562   0.492  3  days   0.532   0.537   0.492  Average   0.544   0.561   0.502  

Figure  10:  PredicLve  rates  of  all  methods  0,1,2  or  3  days  in  advance  

Figure  11:  Frequency  of  predicLve  rates  for  K-­‐Nearest  Neighbor,  1  day  in  advance  

Figure  12:  Figure  of  index  fund  performance  over  the  date  ranges  tested  

•  Collected  250  million  tweets  and  followed  2747  stock  symbols  for  6  months  

•  Analyzed  3  months  of  data  for  correlaLons.    •  Using  K-­‐Nearest  Neighbor,  achieved  average  56%  

predicLon  rate  across  all  2747  stocks  

Next  Steps  •  Confirm  results  on  a  larger  data  set  •  Training  data  has  different  characterisLcs  from  

tesLng  data.  We  can  likely  achieve  be[er  results  by  extending  the  length  of  the  experiment    

•  All  twi[er  users  are  not  created  equal.  Filtering  tweets  by  influence  of  user  is  likely  to  improve  results  

•  Filtering  tweets  by  subject  ma[er  is  likely  to  improve  predicLon  rate