Introduction to SARA's Hadoop Hackathon - dec 7th 2010
evert-lammerts
SARA Hadoop Hackathon, December 7, 2010
DJOERD HIEMSTRA (UTwente)
EDGAR MEIJ (UvA)
2002: Nutch*
2004: MR/GFS**
2006: Hadoop
* http://nutch.apache.org/
** http://labs.google.com/papers/mapreduce.html
   http://labs.google.com/papers/gfs.html
http://wiki.apache.org/hadoop/PoweredBy
2010: A Hype in Production
Super computing
Cluster computing
Grid computing
Cloud computing
GPU computing
http://www.sara.nl/
Data -> Computation: Expensive! :-(
Computation -> Data: Cheaper! :-)
Ref: Luiz André Barroso and Urs Hölzle, Google Inc. The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
DN TT DN TT DN TT DN TT
DN TT DN TT DN TT DN TT
NameNode JobTracker
DN = DataNode
TT = TaskTracker
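The layout above pairs a DataNode with a TaskTracker on every worker, which is what lets work be scheduled next to the data it reads. The following is only a toy sketch of that idea, not Hadoop's actual scheduler: the function name `assign_map_tasks`, the node names, and the greedy strategy are all assumptions made for illustration.

```python
def assign_map_tasks(block_locations, free_slots):
    """Greedily assign one map task per block, preferring data-local nodes.

    block_locations: dict of block id -> list of nodes holding a replica
                     (the kind of metadata a NameNode tracks).
    free_slots:      dict of node -> free task slots
                     (the kind of capacity a TaskTracker reports).
    Returns a dict of block id -> (node, is_local).
    """
    slots = dict(free_slots)  # copy so the caller's dict is not modified
    assignment = {}
    for block, replicas in block_locations.items():
        # Prefer a node that already stores the block and has a free slot.
        local = [node for node in replicas if slots.get(node, 0) > 0]
        if local:
            node, is_local = local[0], True
        else:
            # Otherwise fall back to any free node; the block must be copied.
            remote = [node for node, free in slots.items() if free > 0]
            if not remote:
                break  # no capacity left; remaining blocks stay unassigned
            node, is_local = remote[0], False
        slots[node] -= 1
        assignment[block] = (node, is_local)
    return assignment


if __name__ == "__main__":
    # Hypothetical cluster state, invented for this example.
    blocks = {"b1": ["node1", "node2"], "b2": ["node1"], "b3": ["node3"]}
    slots = {"node1": 1, "node2": 1, "node3": 0, "node4": 1}
    for block, (node, is_local) in assign_map_tasks(blocks, slots).items():
        print(block, node, "local" if is_local else "remote")
```

Because the sketch is greedy, it can miss a better overall placement (here b1 takes node1's only slot, so b2 runs remotely); a real scheduler weighs more factors than this.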
File -> Map -> Shuffle -> Reduce -> Output
Map:     $ echo "${email#*@}, ${name}"
Shuffle: $ sort
Reduce:  $ wc -l
Output:
ewi.utwente.nl, 1
gmail.com, 2
nbic.nl, 1
nikhef.nl, 3
sara.nl, 1
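The shell analogy above can be sketched in a few lines of Python to make the three phases explicit. This is a minimal in-memory illustration, not Hadoop code: the record list and all function names are hypothetical, and a real job would distribute each phase across the cluster.

```python
from collections import defaultdict

# Hypothetical (name, email) records; not the data behind the slide.
RECORDS = [
    ("alice", "alice@sara.nl"),
    ("bob", "bob@gmail.com"),
    ("carol", "carol@nikhef.nl"),
    ("dave", "dave@gmail.com"),
]


def map_phase(records):
    """Emit one (domain, name) pair per record, like the echo step."""
    for name, email in records:
        # Mirrors ${email#*@}: drop everything up to and including the first '@'.
        _, sep, domain = email.partition("@")
        yield (domain if sep else email), name


def shuffle_phase(pairs):
    """Group values by key and order the keys, the role sort plays."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(grouped):
    """Count the values for each key, like wc -l applied per group."""
    return [(key, len(values)) for key, values in grouped]


def count_domains(records):
    return reduce_phase(shuffle_phase(map_phase(records)))


if __name__ == "__main__":
    for domain, count in count_domains(RECORDS):
        print(f"{domain}, {count}")
```

Keeping the phases as separate functions mirrors the pipeline: the map step looks at one record at a time, and the reduce step only ever sees the values that share a key.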
From: Hadoop: The Definitive Guide (2nd Edition), Tom White
Today
09.30 - 09.50 Welcome & Introduction
09.50 - 10.15 Map/Reduce @ University of Twente
10.15 - 10.30 Kick-off hackathon
14.00 - 15.00 Optional: SARA tour
10.30 - 17.00 Hackathon
17.00 - 17.30 Results and closing