Online info2013 reconciliation

Reconciling ourselves to what's out there: how one dataset talks to another. Tony Hirst, Dept of Computing and Communications, The Open University, UK. O:I


Transcript of Online info2013 reconciliation

Page 1: Online info2013 reconciliation

Reconciling ourselves to what's out there: how one dataset talks to another

Tony Hirst, Dept of Computing and Communications, The Open University, UK

O:I

Page 2: Online info2013 reconciliation

I play with other people’s data…

Page 3: Online info2013 reconciliation
Page 4: Online info2013 reconciliation
Page 5: Online info2013 reconciliation
Page 6: Online info2013 reconciliation
Page 7: Online info2013 reconciliation
Page 8: Online info2013 reconciliation
Page 9: Online info2013 reconciliation
Page 10: Online info2013 reconciliation
Page 11: Online info2013 reconciliation

Clustering and Approximate Matching

Page 12: Online info2013 reconciliation

OpenRefine.org

Page 13: Online info2013 reconciliation
Page 14: Online info2013 reconciliation
Page 15: Online info2013 reconciliation

Metaphone3 (soundalike)

Page 16: Online info2013 reconciliation

metaphone('Epic Garments Limited') → EPKKRMNTSLMTT

metaphone('EPOCH GARMENTS LTD') → EPXKRMNTSLTT

Metaphone

Page 17: Online info2013 reconciliation

Levenshtein (edit distance)
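
As a minimal sketch of what edit distance measures, the Python function below counts the single-character insertions, deletions and substitutions needed to turn one string into another. The function name `levenshtein` and the shortened, upper-cased company names in the usage line are choices made for this illustration; they are not taken from the slides.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # previous[j] is the distance between the prefix of a seen so far
    # (excluding the current character) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning i characters of a into an empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb, or keep it
            ))
        previous = current
    return previous[-1]


# Shortened versions of the two names on the Metaphone slide.
print(levenshtein("EPIC GARMENTS", "EPOCH GARMENTS"))  # 2
```

A small distance relative to the length of the strings suggests the two values may be variant spellings of the same thing, which is the kind of signal approximate matching relies on.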

Page 18: Online info2013 reconciliation

You know computers can do this anyway…

Page 19: Online info2013 reconciliation

…it’s just that no-one’s told you how you can do it on your computer with your data…

Page 20: Online info2013 reconciliation

Reconcile your data

http://schoolofdata.org/2013/10/18/in-support-of-the-bangladeshi-garment-industries-data-expedition/

http://bit.ly/ScoDa-bg-reconcile

Page 21: Online info2013 reconciliation

opencorporates.com

Page 22: Online info2013 reconciliation

http://opencorporates.com/reconcile
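
To give a feel for what a tool such as OpenRefine sends to an endpoint like this, here is a rough Python sketch that builds a batch query in the style of the Reconciliation Service API and picks the top-scoring candidate from a reply. The helper names `build_reconcile_url` and `best_candidates` are hypothetical, the sample response (ids, names, scores) is invented for illustration, and whether the endpoint still accepts requests without an API token is an assumption. No network call is made.

```python
import json
from urllib.parse import urlencode

# Endpoint as shown on the slide; its current availability is an assumption.
ENDPOINT = "http://opencorporates.com/reconcile"


def build_reconcile_url(names, limit=3, endpoint=ENDPOINT):
    """Build a batch query URL: one q0, q1, ... entry per name."""
    queries = {
        f"q{i}": {"query": name, "limit": limit}
        for i, name in enumerate(names)
    }
    return endpoint + "?" + urlencode({"queries": json.dumps(queries)})


def best_candidates(response):
    """Return the highest-scoring candidate (or None) for each query key."""
    best = {}
    for key, payload in response.items():
        results = payload.get("result", [])
        best[key] = max(results, key=lambda r: r["score"]) if results else None
    return best


url = build_reconcile_url(["Epic Garments Limited", "EPOCH GARMENTS LTD"])

# Hypothetical reply: the ids, names and scores below are made up.
sample = {
    "q0": {"result": [
        {"id": "/companies/xx/0001", "name": "EPIC GARMENTS LIMITED",
         "score": 71.0, "match": False},
        {"id": "/companies/xx/0002", "name": "EPIC GARMENT CO",
         "score": 40.0, "match": False},
    ]},
    "q1": {"result": []},
}
print(best_candidates(sample))
```

Each candidate carries an identifier as well as a name, and it is that identifier which makes the reconciled record linkable to other datasets.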

Page 23: Online info2013 reconciliation
Page 24: Online info2013 reconciliation
Page 25: Online info2013 reconciliation
Page 26: Online info2013 reconciliation

cell.recon.match.name

Page 27: Online info2013 reconciliation

cell.recon.match.id

Page 28: Online info2013 reconciliation

In this way, we can make our data linkable…
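
A small Python sketch of why a shared identifier makes data linkable: two lists spell the same company differently, but once each row carries its reconciled id (the value that `cell.recon.match.id` exposes in OpenRefine), the rows can be joined on that id rather than on the name. All rows and identifiers below are invented placeholders, and `link_on_id` is a hypothetical helper.

```python
# Hypothetical rows: (name as written in the source, reconciled id).
supplier_list = [
    ("Epic Garments Limited", "/companies/xx/0001"),
    ("Example Knitwear Ltd", "/companies/xx/0003"),
]
# Hypothetical rows: (name as written, reconciled id, inspection outcome).
inspection_reports = [
    ("EPIC GARMENTS LTD.", "/companies/xx/0001", "passed"),
    ("Sample Textiles", "/companies/xx/0009", "failed"),
]


def link_on_id(left, right):
    """Join rows from two lists wherever the reconciled id (column 1) agrees."""
    by_id = {row[1]: row for row in right}
    return [
        (l_row[0], by_id[l_row[1]][0], by_id[l_row[1]][2])
        for l_row in left
        if l_row[1] in by_id
    ]


linked = link_on_id(supplier_list, inspection_reports)
print(linked)
```

The names never have to agree character for character; only the identifiers do.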

Page 29: Online info2013 reconciliation

Reconcile your data with what’s out there

Page 30: Online info2013 reconciliation

And why not have a go at clustering too…?

Page 31: Online info2013 reconciliation

Can you match your data to itself?
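
One way to try matching a column against itself is key-collision clustering: reduce every value to a normalised key and group the values whose keys collide. The Python sketch below is a simplified take on the "fingerprint" keying idea used in OpenRefine's clustering dialog; it skips the accent folding that a fuller version would do, and the function names and sample names are assumptions made for this illustration.

```python
import string
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Build a key: trim, lowercase, drop punctuation, sort the unique tokens."""
    cleaned = value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values by fingerprint; keep groups with more than one spelling."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(members) for members in groups.values() if len(members) > 1]


names = [
    "Epic Garments Limited",
    "EPIC GARMENTS LIMITED",
    "Limited, Epic Garments",
    "EPOCH GARMENTS LTD",
]
print(cluster(names))
```

Here the first three spellings share one key and form a cluster, while "EPOCH GARMENTS LTD" stays on its own. Catching near misses like that one is where soundalike keys or edit distance come in.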

Page 32: Online info2013 reconciliation

O:I blog.ouseful.info

@psychemedia