Open Refine: Clean your messy data
MSL
George A. Smathers Libraries
Marston Science Library
OPEN REFINE: CLEAN YOUR MESSY DATA
Valrie Minson
Outreach Librarian for Agricultural Sciences
OpenRefine
OpenRefine.org (formerly Google Refine)
Open source
Runs locally on your computer (privacy)
Looks like Excel or Google Spreadsheets
Data: clean it, transform it, extend it!
Excel: my messy data
Data issues
Human/free-text errors
Inconsistent journal titles
Redundant citations
Data (volume/issue) in wrong fields
ARTICLES IN CAPS LOCK
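Two of the issues above (caps-lock titles and redundant citations) can be illustrated outside OpenRefine with a few lines of Python. This is only a rough sketch under stated assumptions: the sample citations, the field names `title` and `year`, and the helpers `fix_caps` and `dedupe` are all hypothetical and do not come from the slides or from OpenRefine itself.

```python
def fix_caps(title):
    """Title-case a string only when it is entirely upper case."""
    return title.title() if title.isupper() else title


def dedupe(citations):
    """Keep the first citation for each (normalized title, year) pair."""
    seen = set()
    unique = []
    for citation in citations:
        title = fix_caps(citation["title"])
        key = (title.casefold(), citation["year"])
        if key not in seen:
            seen.add(key)
            unique.append({**citation, "title": title})
    return unique


# Hypothetical sample data, invented for illustration.
citations = [
    {"title": "SOIL NITROGEN IN CITRUS GROVES", "year": 2009},
    {"title": "Soil Nitrogen in Citrus Groves", "year": 2009},
    {"title": "Irrigation scheduling for tomatoes", "year": 2011},
]

cleaned = dedupe(citations)
```

Comparing titles case-insensitively is a deliberate simplification; real citation data usually needs more careful matching.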
Data in Open Refine
Use filters to edit data
Faceted: journal titles by count
Filtering/faceting
Use filters or facets to select subsets of data
Journal of Agriculture
Journ of Agriculture
Journla of Agriculture
Agriculture, Journal of
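To make the idea of a facet concrete, here is a minimal Python sketch of what a text facet does: count the rows per distinct value, then select the rows behind one value. The sample rows, the column names, and the function names are assumptions made for illustration. The `fingerprint` helper is a simplified stand-in loosely modeled on key-based clustering, not OpenRefine's actual implementation.

```python
import string
from collections import Counter

# Hypothetical rows; the journal titles are the variants listed above.
rows = [
    {"journal": "Journal of Agriculture", "year": 2010},
    {"journal": "Journal of Agriculture", "year": 2011},
    {"journal": "Journ of Agriculture", "year": 2011},
    {"journal": "Journla of Agriculture", "year": 2012},
    {"journal": "Agriculture, Journal of", "year": 2012},
]


def text_facet(rows, column):
    """Count how many rows carry each distinct value in a column."""
    return Counter(row[column] for row in rows)


def select(rows, column, value):
    """Return only the rows whose column equals the chosen facet value."""
    return [row for row in rows if row[column] == value]


def fingerprint(value):
    """Simplified key: lowercase, drop punctuation, sort unique tokens."""
    no_punct = value.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(no_punct.split())))


facet = text_facet(rows, "journal")
subset = select(rows, "journal", "Journ of Agriculture")
keys = Counter(fingerprint(row["journal"]) for row in rows)
```

In this sketch the reordered title "Agriculture, Journal of" gets the same key as "Journal of Agriculture", while the misspelled variants keep their own keys and would still need manual editing or a fuzzier matching method.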
Not just for messy data
General Refine Expression Language (GREL)
Transform list into table (create columns)
Merge datasets
Export into Excel, CSV, OpenOffice, Google Spreadsheets, JSON, RDF, etc.
Use with other systems (Excel, SPSS, etc.)
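As a rough illustration of "list into table" followed by export, the Python sketch below splits delimited lines into named columns and writes the result out as CSV and JSON text. It uses only the standard library. The sample lines, the `;` separator, the column names, and the helpers `list_to_table` and `to_csv` are hypothetical; in OpenRefine these steps are done through its menus rather than with code like this.

```python
import csv
import io
import json

# Hypothetical flat list: one delimited record per line.
lines = [
    "Journal of Agriculture; 12; 3",
    "Journal of Agriculture; 12; 4",
]


def list_to_table(lines, columns, sep=";"):
    """Split each line on the separator and label the parts as columns."""
    return [
        dict(zip(columns, (part.strip() for part in line.split(sep))))
        for line in lines
    ]


def to_csv(table):
    """Serialize a list of row dictionaries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(table[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(table)
    return buffer.getvalue()


table = list_to_table(lines, ["journal", "volume", "issue"])
csv_text = to_csv(table)
json_text = json.dumps(table, indent=2)
```

The CSV text can then be opened in Excel or SPSS, which is the same hand-off the export options above provide.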
Great videos