Data 101- Big Data: What is it and Why Do We Care?

Post on 20-Aug-2015

BIG DATA: What is it and Why Do We Care?

Elaine M. Lasda Bergman

University at Albany

March 6, 2014

elasdabergman@albany.edu

Webinar Presentation for the Special Libraries Association

What we’re going to cover today

• What is Big Data

• What is great about Big Data

• What is not so great

• The role of Librarians and Info Pros in the Big Data landscape

• Tools and Resources

How Big is Big?

http://breadboxes.info/files/2012/01/bread-box.jpg

The Three Vs

• Variety

• Velocity

• Volume

Big Data vs. Open Data

Based on http://www.opendatanow.com/2013/11/new-big-data-vs-open-data-mapping-it-out/

Diagram labels: Big Data, Open Gov’t, Open Data

Is Big Data a Game Changer?

http://bellwethergames.com/images/stories/blog/salvaged%20bits.jpg

Types of Data Scientists

• Statistics

• Mathematics

• Data Engineering

• Machine Learning

• Business

• Software Engineering

• Visualization

• GIS

http://www.datasciencecentral.com/profiles/blogs/six-categories-of-data-scientists

Big Data is FANTASTIC!

http://4206e9.medialib.glogster.com/media/6bde80470b0f0ffe3b59b390fcb54a117c65f2406a167bd2589cabc3e9601461/excited-smiley-face.jpg

Applications of Big Data (in general)

http://analytics-arena.blogspot.com/2012/12/the-famous-beer-diaper-planogram.html

Big Data is TERRIBLE!

http://startupmixology.tech.co/2010-chicago/staff/harper-reed

Caveats and limitations

http://www.guy-sports.com/fun_pictures/no_brain.jpg

False Correlations

http://www.cdc.gov/healthyweight/images/height.jpg

http://www.sbsd.k12.ca.us/cms/lib02/CA01001886/Centricity/Domain/569/kids_reading.jpg

Competencies for Info Pros/Librarians

Add Data Literacy!

http://remc12.wikispaces.com/file/view/InformationLit.jpg/32256581/InformationLit.jpg

What We Just Talked About

• The Three V’s

• Amazing Capabilities

• The Human Element

• Our Roles as Information Professionals

Now the Fun Stuff!

http://www.whee.com.sg/images/common/logo-whee.png

Read!

• Big Data: A Revolution That Will Transform How We Live, Work, and Think, by Viktor Mayer-Schönberger and Kenneth Cukier http://www.amazon.com/Big-Data-Revolution-Transform-Think/dp/0544002695

• “For Dummies” Books

• An Introduction to Data Science, by Jeffrey Stanton http://jsresearch.net/

• Frontiers in Massive Data Analysis http://www.nap.edu/catalog.php?record_id=18374

General Resource Lists/Training

• Syracuse University Library Guide on Data Science http://researchguides.library.syr.edu/datascience

• ALA ACRL “Keeping Up With Big Data” page http://www.ala.org/acrl/publications/keeping_up_with/big_data

• Data Information Literacy at Purdue wiki http://wiki.lib.purdue.edu/display/ste/Home

• MOOCs

Policy/Best Practices

• Council for Big Data, Ethics, and Society http://www.datasociety.net/initiatives/council-for-big-data-ethics-and-society/

• Research Data Management: Principles, Practices, and Prospects (CLIR) http://www.clir.org/pubs/reports/pub160

• Rebuilding the Mosaic http://www.nsf.gov/pubs/2011/nsf11086/nsf11086.pdf

• GovLab http://thegovlab.org/

• Terminology issues

Keep Current

Newsletters

• Data Science Weekly http://www.datascienceweekly.org/

• Data Science Central http://www.datasciencecentral.com/

• R-Bloggers http://www.r-bloggers.com/

Blogs

• Hilary Mason http://www.hilarymason.com/

• Mathbabe http://mathbabe.org/

• Bits Blog in NY Times http://bits.blogs.nytimes.com/

• No Free Hunch http://blog.kaggle.com/

• What’s the Big Data http://whatsthebigdata.com/

PLAY!

http://brainysmurf1234.files.wordpress.com/2011/10/sand-castle.png

Big, Open Data Sources

http://lightworkersalliance.com/wp-content/uploads/2011/06/Open-Door1.jpg

Google Public Data Explorer https://www.google.com/publicdata/directory

Amazon Web Services http://aws.amazon.com/

Scale Unlimited http://www.scaleunlimited.com/datasets/public-datasets/

Database Structure/Data Analysis

• R http://cran.us.r-project.org/

• Hive/Hadoop http://hive.apache.org/

• PostgreSQL http://www.postgresql.org/

• Project Bamboo Dirt http://dirt.projectbamboo.org/

• MLcomp http://mlcomp.org/

Visualization tools

http://us.123rf.com/400wm/400/400/lucadp/lucadp1204/lucadp120400012/13060060-one-crystal-ball-with-a-bar-chart-inside-it-a-concept-of-financial-and-business-forecasts-3d-render.jpg

Piktochart http://piktochart.com/

Esri http://www.esri.com/

Big ML https://bigml.com/

ManyEyes http://www-958.ibm.com/software/analytics/manyeyes/

Google Fusion Tables https://support.google.com/fusiontables/answer/2571232?hl=en

Chartsbin http://chartsbin.com/

iCharts http://www.icharts.net/

Just Plain Cool!

http://images5.fanpop.com/image/photos/30600000/The-Fonz-arthur-fonzarelli-30631370-621-362.jpg

CSSeer http://csseer.ist.psu.edu/

StreetBump http://streetbump.org/

Information is Beautiful http://www.informationisbeautiful.net/

Facebook’s Data Science Page https://www.facebook.com/data

Google Trends http://www.google.com/trends/

Flowing Data http://flowingdata.com/

GapMinder http://www.gapminder.org/

One Final Note: Professional Development

SLA Data Caucus initiative!

IASSIST http://www.iassistdata.org/

ASIS&T http://www.asis.org/

LinkedIn Groups, see: http://researchguides.library.syr.edu/content.php?pid=484454&sid=4078160

Contact Me

Elaine Lasda Bergman

elasdabergman@albany.edu

http://www.slideshare.net/librarian68/

@ElaineLibrarian on Twitter