Seek and you will find?
Yes ... but probably not what you were expecting
NetIKX, 18th March 2015, #netikx72
British Dental Association, 64 Wimpole Street, London W1G 8YS
Karen Blakeman, RBA Information Services
http://www.rba.co.uk/
[email protected]/karenblakeman
Presentation available for a short time at: http://www.rba.co.uk/as/
This work is licensed under a Creative Commons Attribution-Share-Alike License.
EU - so-called “right to be forgotten” ruling
www.rba.co.uk 2
Mario Costeja Gonzalez
Edition of Monday, January 19, 1998, page 23 - Newspaper - Lavanguardia.es
http://hemeroteca.lavanguardia.com/preview/1998/01/19/pagina-23/33842001/pdf.html
EU Court of Justice ruled that Google is a “data controller” under Data Protection legislation and must remove, if requested, links to information that is “inadequate, irrelevant .... or excessive” from search results on a person’s name.
Applies to search engines in the EU, Norway, Liechtenstein, Iceland and Switzerland
17/03/2015
Scale of EU 'right to be forgotten' rules revealed by Google http://www.dailymail.co.uk/news/article-2952260/The-scale-EU-right-forgotten-rules-revealed-Google-says-forced-delete-260-000-links-legislation-criticised-protecting-terrorists-criminals.html
Spanish Newspapers Suddenly Regret Forcing Google Out Of Spain - http://uk.businessinsider.com/spanish-newspapers-have-changed-their-minds-and-are-now-begging-google-news-to-stay-2014-12
How Google News Lives On In Spain Despite Being Closed http://searchengineland.com/google-noticias-lives-google-spains-homepage-211146
Oh joy - NOT!
More UK information vanishes into GOV.UK http://www.rba.co.uk/wordpress/2015/02/28/more-uk-information-vanishes-into-gov-uk/
Where’s the information gone to?
List of departments, agencies and public bodies at https://www.gov.uk/government/organisations
“Home pages” on GOV.UK
Data and information may still be on the old websites
Data may have been moved to http://data.gov.uk/
Information may have been sent to http://www.nationalarchives.gov.uk/webarchive/
Or information may have been “lost”
Don’t rely on just GOV.UK search – use Google/Bing site: command combined with filetype: if appropriate
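The site: and filetype: advice above can be sketched in a few lines of Python. This is only an illustration: the function names, the example search terms and the choice of search URLs are assumptions, not part of the original slides.

```python
from urllib.parse import urlencode


def build_site_query(terms, site, filetype=None):
    """Build a search string limited to one site and, optionally, one file type."""
    parts = [terms, f"site:{site}"]
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


def search_url(query, engine="google"):
    """Turn the query into a URL that can be pasted into a browser."""
    # Assumed base URLs for the two search engines named on the slide.
    bases = {
        "google": "https://www.google.co.uk/search",
        "bing": "https://www.bing.com/search",
    }
    return f"{bases[engine]}?{urlencode({'q': query})}"


if __name__ == "__main__":
    # Hypothetical search: PDF documents about flood defence spending on gov.uk.
    query = build_site_query("flood defence spending", "gov.uk", "pdf")
    print(query)
    print(search_url(query, "google"))
    print(search_url(query, "bing"))
```

Running the same query against the old departmental domain as well as gov.uk is a quick way to check whether a document was moved or left behind.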
"Yes Minister" The Skeleton in the Cupboard (TV Episode 1982) -
Quotes - IMDb http://www.imdb.com/title/tt0751825/quotes
James Hacker: [reads memo] This file contains the complete set of papers, except for a number of secret documents, a few others which are part of still active files, some correspondence lost in the floods of 1967...
James Hacker: Was 1967 a particularly bad winter?
Sir Humphrey Appleby: No, a marvellous winter. We lost no end of embarrassing files.
James Hacker: [reads] Some records which went astray in the move to London and others when the War Office was incorporated in the Ministry of Defence, and the normal withdrawal of papers whose publication could give grounds for an action for libel or breach of confidence or cause embarrassment to friendly governments.
James Hacker: That's pretty comprehensive. How many does that normally leave for them to look at?
James Hacker: How many does it actually leave? About a hundred?... Fifty?... Ten?... Five?... Four?... Three?... Two?... One?... *Zero?*
Sir Humphrey Appleby: Yes, Minister.
[Add “transfer to GOV.UK” to the list of excuses]
http://www.nationalarchives.gov.uk/webarchive/
Remember - Google knows best
Google very kindly....
1. Goes to great lengths to personalise your results according to your search history, contacts, location, device, phase of the moon, the train, bus or tram you take to work and anything else it can think of
2. Rewrites your search for you by leaving out some of your terms and looking for weird and wonderful alternatives
3. Doesn’t bother you with everything that might be relevant
4. Changes its algorithms on a regular basis to keep you on your toes
5. Constantly conducts experiments on you to ensure that you don’t feel forgotten
Google no longer looks at keywords in isolation
Tries to make “sense” of your search and put it into context: handles natural language queries and uses what others have searched for and clicked on
Constantly changing – all bets are off when it comes to predicting what your results will look like
How you ask your question is taken into account, as is the device you are using
Provides Quick Answers and “facts”: extracts from websites giving you the “answer”
What could possibly go wrong?
And then on another day...
http://googlesystem.blogspot.co.uk/2013/11/google-knowledge-graph-gets-confused.html
One of many wrong Quick Answers submitted to me by a delegate at a recent conference
Many thanks to Philip Stirups for the example. About 24 hours after this screenshot was taken, Google corrected the error.
Google "Henry VIII wives": Jane Seymour reveals search engine's blind spots
http://www.slate.com/blogs/future_tense/2013/09/23/google_henry_viii_wives_jane_seymour_reveals_search_engine_s_blind_spots.html
Image courtesy of Will Oremus
Waitrose Caversham opening times New Year’s Day
Google used the standard opening times in its answer, not the seasonal opening times
http://searchengineland.com/google-shows-source-credit-quick-answers-knowledge-graph-203293
But Google’s choice of “basic factual data” may be wrong!
Google wants to rank websites based on facts not links - 28 February 2015 - New Scientist http://www.newscientist.com/article/mg22530102.600-google-wants-to-rank-websites-based-on-facts-not-links.html
http://arxiv.org/abs/1502.03519
Or maybe not....
Google: We Are Not Using Facts For Search Engine Ranking Now https://www.seroundtable.com/google-fact-ranking-not-happening-19979.html
Artificial intelligence
Artificial Intelligence machine plays video games like a pro - CBBC Newsround http://www.bbc.co.uk/newsround/31633702
Google buys UK artificial intelligence startup Deepmind for £400m http://www.theguardian.com/technology/2014/jan/27/google-acquires-uk-artificial-intelligence-startup-deepmind
Google buys two more UK artificial intelligence startups http://www.theguardian.com/technology/2014/oct/23/google-uk-artificial-intelligence-startups-machine-learning-dark-blue-labs-vision-factory
But official data is OK, isn’t it?
http://www.google.com/publicdata/
Google Public Data Explorer Minimum Wage
Some countries are missing, e.g. Germany
Eurostat - Minimum Wage
http://www.rightmove.co.uk/ - uses Land Registry data
Land Registry data often goes missing. I know that 10 months ago the sold price for number 90 was listed as £185,000, for a sale in 2012.
http://landregistry.data.gov.uk/app/ppd/search
Data doesn’t show up via the Land Registry Open Data interface either.
Missing data
Error report filed with the Land Registry - still waiting for a response
Why might a property/price paid not appear in the data?
Seems not that uncommon according to discussion boards – usually data entry error (but the above example was in the open data sets until a few months ago)
Absence of price – gift of property or purchase of a share
Impractical to calculate price e.g. bulk purchase of properties
Commercial transactions
https://www.gov.uk/about-the-price-paid-data#data-excluded-from-the-house-price-index-and-price-paid-data
Raw data files were downloaded and searched, and the data for number 90 is missing
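A check of the raw files like the one described above could look something like the following Python sketch. It is not the method actually used for the slides. It assumes the Price Paid CSV has no header row and that price, date, postcode, house number (PAON) and street sit at the column positions shown, so check those against the published column list before relying on it. The function names are hypothetical.

```python
import csv

# Assumed column positions in the headerless Price Paid CSV file.
PRICE, DATE, POSTCODE, PAON, STREET = 1, 2, 3, 7, 9


def find_sales(rows, house_number, street):
    """Return (date, price, postcode) for every row matching the house number and street."""
    wanted_number = house_number.strip().upper()
    wanted_street = street.strip().upper()
    return [
        (row[DATE], row[PRICE], row[POSTCODE])
        for row in rows
        if len(row) > STREET  # skip short or malformed rows
        and row[PAON].strip().upper() == wanted_number
        and row[STREET].strip().upper() == wanted_street
    ]


def find_sales_in_file(path, house_number, street):
    """Search one downloaded CSV file; an empty list means no matching record."""
    with open(path, newline="", encoding="utf-8") as handle:
        return find_sales(csv.reader(handle), house_number, street)
```

An empty result for a sale you know took place, across every yearly file, is the kind of evidence worth attaching to an error report.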
Title document
Data IS present in the title document.
Free Companies House data to boost UK economy - Press releases - GOV.UK https://www.gov.uk/government/news/free-companies-house-data-to-boost-uk-economy
Companies House free data
http://download.companieshouse.gov.uk/en_accountsdata.html
Bulk data – all or nothing
Large daily files available as zipped files
No support provided – you’re on your own!
Companies House free data
Each file within the zipped file is a separate document. Note the uninformative file names!
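Since the file names give little away, a first step is simply to count what is inside a downloaded archive before unpacking it. The sketch below uses only Python's standard zipfile module. The function name is hypothetical, and nothing is assumed about the names or formats of the files inside.

```python
import zipfile
from collections import Counter


def summarise_archive(path):
    """Count the documents in a zip file, grouped by file extension."""
    with zipfile.ZipFile(path) as archive:
        # Ignore directory entries; every remaining entry is one document.
        infos = [info for info in archive.infolist() if not info.is_dir()]
    by_extension = Counter(
        info.filename.rsplit(".", 1)[-1].lower() if "." in info.filename else ""
        for info in infos
    )
    return {
        "documents": len(infos),
        "by_extension": dict(by_extension),
        "uncompressed_bytes": sum(info.file_size for info in infos),
    }
```

Calling summarise_archive on one daily file shows how many separate documents would need to be opened and parsed to find a single company.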
Variable Pitch http://www.variablepitch.co.uk/stations/1310/
Uses public electricity micro-generation data
Variable Pitch http://www.variablepitch.co.uk/stations/2580/
Virginia Station is the hydroelectric installation at Windsor Castle – no data!
FoI request for generation data for Virginia Station
https://www.whatdotheyknow.com/request/request_electricity_output_of_ro
http://www.independent.co.uk/news/uk/home-news/royal-family-granted-new-right-of-secrecy-2179148.html
And finally.....
Per capita consumption of cheese (US) correlates with number of people who died by becoming tangled in their bedsheets
http://www.tylervigen.com/view_correlation?id=7