Hadoop - Lessons Learned

Description

Hadoop has proven to be an invaluable tool for many companies over the past few years. Yet it has its ways, and knowing them up front can save valuable time. This session is a rundown of the ever-recurring lessons learned from running various Hadoop clusters in production since version 0.15. What to expect from Hadoop - and what not? How to integrate Hadoop into existing infrastructure? Which data formats to use? What compression? Small files vs. big files? Append or not? Essential configuration and operations tips. What about querying all the data? The project, the community, and pointers to interesting projects that complement the Hadoop experience.

Transcript of Hadoop - Lessons Learned

Hadoop - lessons learned

@tcurdt
github.com/tcurdt

yourdailygeekery.com

Data

hiring

Agenda

· hadoop? really? cloud?
· integration
· mapreduce
· operations
· community and outlook

Why Hadoop?

“It is a new and improved version of enterprise tape drive”

20 machines, 20 files, 1.5 GB each

grep “needle” file

hadoop job grep.jar

[bar chart: runtime of grep vs. the hadoop job]

unfair

Map Reduce

Run your own?

http://bit.ly/elastic-mr-pig

Integration

black box

· hadoop-cat (see the sketch below)

· hadoop-grep

· hadoop-range --prefix /logs --from 2012-05-15 --until 2012-05-22 --postfix /*play*.seq | xargs hadoop jar

· streaming jobs

Engineers
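hadoop-cat, hadoop-grep and hadoop-range are the speaker's in-house wrappers and their code isn't shown; as a rough illustration of what unboxing the black box amounts to, here is a minimal hadoop-cat style sketch, assuming SequenceFile inputs and the old-style Reader API (class name and details are illustrative, not the original tool):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

// hypothetical stand-in for a hadoop-cat wrapper: dump a SequenceFile as text
public class HadoopCat {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, new Path(args[0]), conf);
    // instantiate whatever key/value classes the file was written with
    Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
    Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
    while (reader.next(key, value)) {
      System.out.println(key + "\t" + value);
    }
    reader.close();
  }
}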

· mount hdfs

· pig / hive

· data dumps

Non-Engineering Folks

Map Reduce

[diagram: MapReduce data flow - HDFS files are read by the InputFormat and divided into Splits; each Split runs through Map, Combiner and a local Sort; the Partitioner assigns map output to reducers; Copy and Merge ships it across; the Reducers write results through the OutputFormat]

MAPREDUCE-346 (since 2009)

12/05/25 01:27:38 INFO mapred.JobClient: Reduce input records=106
..
12/05/25 01:27:38 INFO mapred.JobClient: Combine output records=409
12/05/25 01:27:38 INFO mapred.JobClient: Map input records=112705844
12/05/25 01:27:38 INFO mapred.JobClient: Reduce output records=4
12/05/25 01:27:38 INFO mapred.JobClient: Combine input records=64842079
..
12/05/25 01:27:38 INFO mapred.JobClient: Map output records=64841776

map in      : 112705844 *********************************
map out     :  64841776 *****************
combine in  :  64842079 *****************
combine out :       409 |
reduce in   :       106 |
reduce out  :         4 |

Job Counters

map in      : 20000 **************
map out     : 40000 ******************************
combine in  : 40000 ******************************
combine out : 10001 ********
reduce in   : 10001 ********
reduce out  : 10001 ********

Job Counters
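The counter dumps above are the quickest sanity check for whether a combiner is actually earning its keep. A minimal sketch of pulling the same numbers out programmatically after the job finishes, assuming the newer mapreduce API and its built-in TaskCounter group:

import java.io.IOException;
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.TaskCounter;

public class CombinerCheck {
  // call after job.waitForCompletion(true) has returned
  static void printCombinerEffect(Job job) throws IOException {
    Counters counters = job.getCounters();
    long mapOut     = counters.findCounter(TaskCounter.MAP_OUTPUT_RECORDS).getValue();
    long combineIn  = counters.findCounter(TaskCounter.COMBINE_INPUT_RECORDS).getValue();
    long combineOut = counters.findCounter(TaskCounter.COMBINE_OUTPUT_RECORDS).getValue();
    long reduceIn   = counters.findCounter(TaskCounter.REDUCE_INPUT_RECORDS).getValue();
    // a combiner that barely shrinks its input is just burning CPU
    System.out.printf("map out %d, combine %d -> %d, reduce in %d%n",
        mapOut, combineIn, combineOut, reduceIn);
  }
}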

mapred.reduce.tasks = 0

Map-only
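mapred.reduce.tasks = 0 is the config-level knob; a minimal sketch of the same map-only setup through the Job API (job name is a placeholder):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class MapOnly {
  static Job mapOnlyJob(Configuration conf) throws Exception {
    Job job = new Job(conf, "map-only");   // Job.getInstance(conf, ...) on newer versions
    // zero reducers: map output goes straight to the OutputFormat,
    // skipping sort, shuffle and merge entirely
    job.setNumReduceTasks(0);
    return job;
  }
}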

public class EofSafeSequenceFileInputFormat<K,V>
    extends SequenceFileInputFormat<K,V> {
  ...
}

public class EofSafeRecordReader<K,V> extends RecordReader<K,V> {
  ...
  public boolean nextKeyValue() throws IOException, InterruptedException {
    try {
      return this.delegate.nextKeyValue();
    } catch (EOFException e) {
      return false;
    }
  }
  ...
}

EOF on append
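A one-liner, but worth spelling out: the tolerant input format only helps if the job actually uses it. A minimal wiring sketch, assuming the class above is on the classpath:

import org.apache.hadoop.mapreduce.Job;

public class EofSafeWiring {
  static void useEofSafeInput(Job job) {
    // tolerate EOFException on files that are still being appended to
    job.setInputFormatClass(EofSafeSequenceFileInputFormat.class);
  }
}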

ASN1, custom java serialization, Thrift

Serialization

before

now

protobuf

public static class Play extends CustomWritable {

  public final LongWritable time = new LongWritable();
  public final LongWritable owner_id = new LongWritable();
  public final LongWritable track_id = new LongWritable();

  public Play() {
    fields = new WritableComparable[] { owner_id, track_id, time };
  }
}

Custom Writables
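CustomWritable is the speaker's own base class and its code isn't shown; a hypothetical sketch of what such a base class could look like, assuming the fields array from the Play constructor drives (de)serialization in order:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.io.WritableComparable;

// hypothetical sketch, not the original implementation
public abstract class CustomWritable implements Writable {

  // subclasses fill this in, e.g. { owner_id, track_id, time }
  protected WritableComparable[] fields;

  public void write(DataOutput out) throws IOException {
    for (WritableComparable field : fields) {
      field.write(out);
    }
  }

  public void readFields(DataInput in) throws IOException {
    for (WritableComparable field : fields) {
      field.readFields(in);
    }
  }
}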

BytesWritable bytes = new BytesWritable();
...
byte[] buffer = bytes.getBytes();

Fear the State
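The snippet above is the classic trap: getBytes() hands back the padded, reused backing array, so only the first getLength() bytes are valid. A minimal defensive-copy sketch:

import java.util.Arrays;
import org.apache.hadoop.io.BytesWritable;

public class BytesWritableCopy {
  // only the first getLength() bytes of the backing array are the payload;
  // the rest is padding, and the array is reused between records
  static byte[] payload(BytesWritable bytes) {
    return Arrays.copyOf(bytes.getBytes(), bytes.getLength());
  }
}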

public void reduce(LongTriple key, Iterable<LongWritable> values, Context ctx) {
  for (LongWritable v : values) {
  }
  for (LongWritable v : values) {
  }
}

public void reduce(LongTriple key, Iterable<LongWritable> values, Context ctx) {
  buffer.clear();
  for (LongWritable v : values) {
    buffer.add(v);
  }
  for (LongWritable v : buffer.values()) {
  }
}

Re-Iterate

HADOOP-5266 (applied to 0.21.0)
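The buffer in the slide is the speaker's own helper; a minimal sketch of the same idea with a plain ArrayList. The copy matters: Hadoop reuses the same LongWritable instance for every value, so buffering the references alone would leave a list of identical entries.

import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.LongWritable;

public class ReIterate {
  static void twoPasses(Iterable<LongWritable> values) {
    List<Long> buffer = new ArrayList<Long>();
    for (LongWritable v : values) {
      buffer.add(v.get());   // copy the primitive, not the reused Writable
    }
    for (long v : buffer) {
      // second pass over the buffered copies
    }
  }
}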

long min = 1;
long max = 10000000;

FastBitSet set = new FastBitSet(min, max);

for (long i = min; i < max; i++) {
  set.set(i);
}

BitSets

org.apache.lucene.util.*BitSet
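One of the org.apache.lucene.util variants (in Lucene 3.x/4.x) is OpenBitSet; a rough sketch of the same loop with it, assuming the FastBitSet(min, max) range maps onto a zero-based Lucene bit set:

import org.apache.lucene.util.OpenBitSet;

public class BitSetExample {
  public static void main(String[] args) {
    long min = 1;
    long max = 10000000;
    // OpenBitSet is zero-based, so shift the [min, max) range down by min
    OpenBitSet set = new OpenBitSet(max - min);
    for (long i = min; i < max; i++) {
      set.set(i - min);
    }
    System.out.println(set.cardinality());
  }
}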

Data Structures

http://bit.ly/data-structures
http://bit.ly/bloom-filters
http://bit.ly/stream-lib

General Tips

· test on small datasets, test on your machine

· many reducers

· always consider a combiner and partitioner (see the sketch below)

· pig / streaming for one-time jobs, java/scala for recurring

http://bit.ly/map-reduce-book
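For the combiner/partitioner tip above, a minimal wiring sketch against the newer mapreduce API. SumReducer and ModPartitioner are placeholder classes, and reusing the reducer as the combiner only works because a sum is commutative and associative:

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Partitioner;
import org.apache.hadoop.mapreduce.Reducer;

// placeholder classes, for illustration only
public class CombinerAndPartitioner {

  // sums can be pre-aggregated map-side, so the reducer doubles as the combiner
  public static class SumReducer
      extends Reducer<LongWritable, LongWritable, LongWritable, LongWritable> {
    @Override
    protected void reduce(LongWritable key, Iterable<LongWritable> values, Context ctx)
        throws IOException, InterruptedException {
      long sum = 0;
      for (LongWritable v : values) {
        sum += v.get();
      }
      ctx.write(key, new LongWritable(sum));
    }
  }

  // route by key so related records end up on the same reducer
  public static class ModPartitioner extends Partitioner<LongWritable, LongWritable> {
    @Override
    public int getPartition(LongWritable key, LongWritable value, int numPartitions) {
      return (int) ((key.get() & Long.MAX_VALUE) % numPartitions);
    }
  }

  static void wire(Job job) {
    job.setReducerClass(SumReducer.class);
    job.setCombinerClass(SumReducer.class);
    job.setPartitionerClass(ModPartitioner.class);
    job.setNumReduceTasks(42);   // "many reducers"
  }
}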

Operations

pdsh -w "hdd[001-019]" \
  "sudo sv restart /etc/sv/hadoop-tasktracker"

runit / init.d

pdsh / dsh

use chef / puppet

Hardware

· 2x name nodes, raid 1

· 12 cores, 48GB RAM, xfs, 2x1TB

· n x data nodes, no raid

· 12 cores, 16GB RAM, xfs, 4x2TB

Monitoring

dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
dfs.period=10
dfs.servers=...

mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
mapred.period=10
mapred.servers=...

jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
jvm.period=10
jvm.servers=...

rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
rpc.period=10
rpc.servers=...

# ignore
ugi.class=org.apache.hadoop.metrics.spi.NullContext

Monitoring

[graphs: total capacity / capacity used]

Compression

# of 64MB blocks
# of bytes needed
# of bytes used
# bytes reclaimed

bzip2 / gzip / lzo / snappy

io.seqfile.compression.type = BLOCK
io.seqfile.compression.blocksize = 512000
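The same choice can also be made per file when writing. A minimal sketch using the older SequenceFile.createWriter signature, with Snappy as the codec; the path and key/value types are throwaway placeholders:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.compress.SnappyCodec;

public class WriteCompressedSeqFile {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // BLOCK compression batches many records per compressed block,
    // which is what io.seqfile.compression.blocksize tunes
    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, new Path("/tmp/example.seq"),
        LongWritable.class, Text.class,
        SequenceFile.CompressionType.BLOCK, new SnappyCodec());
    writer.append(new LongWritable(1L), new Text("one record"));
    writer.close();
  }
}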

Janitor

hadoop-expire -url namenode.here -path /tmp -mtime 7d -delete

The last block of an HDFS file only occupies the required space. So a 4k file only consumes 4k on disk. -- Owen

BUSTED

find \
  -wholename "/var/log/hadoop/hadoop-*" \
  -wholename "/var/log/hadoop/job_*.xml" \
  -wholename "/var/log/hadoop/history/*" \
  -wholename "/var/log/hadoop/history/\\.*.crc" \
  -wholename "/var/log/hadoop/history/done/*" \
  -wholename "/var/log/hadoop/history/done/\\.*.crc" \
  -wholename "/var/log/hadoop/userlogs/attempt_*" \
  -mtime +7 \
  -daystart \
  -delete

Logfiles

Limits

hdfs   hard nofile 128000
hdfs   soft nofile  64000
mapred hard nofile 128000
mapred soft nofile  64000

fs.file-max = 128000

sysctl.conf

limits.conf

Localhost

127.0.0.1 localhost localhost.localdomain
127.0.1.1 hdd01

127.0.0.1 localhost localhost.localdomain
127.0.1.1 hdd01.some.net hdd01

before

hadoop

Rackaware

<property>
  <name>topology.script.file.name</name>
  <value>/path/to/script/location-from-ip</value>
  <final>true</final>
</property>

#!/usr/bin/ruby
location = {
  'hdd001.some.net' => '/ams/1',
  '10.20.2.1'       => '/ams/1',
  'hdd002.some.net' => '/ams/2',
  '10.20.2.2'       => '/ams/2',
}

puts ARGV.map { |ip| location[ip] || '/default-rack' }.join(' ')

site config

topology script

for f in `hadoop fsck / | grep "Replica placement policy is violated" | awk -F: '{print $1}' | sort | uniq | head -n1000`; do
  hadoop fs -setrep -w 4 $f
  hadoop fs -setrep 3 $f
done

Fix the Policy

hadoop fsck / -openforwrite -files \
  | grep -i "OPENFORWRITE: MISSING 1 blocks of total size" \
  | awk '{print $1}' \
  | xargs -L 1 -i hadoop dfs -mv {} /lost+notfound

Fsck

Community

[chart: hadoop mailing list activity, from markmail.org]

Community

The Enterprise Effect

“The Community Effect” (in 2011)

Community

[charts: mapreduce and core mailing list activity, from markmail.org]

The Future

real time
incremental

flexible pipelines
refined API

refined implementation

Real Time Datamining and Aggregation at Scale (Ted Dunning)

Eventually Consistent Data Structures (Sean Cribbs)

Real-time Analytics with HBase (Alex Baranau)

Profiling and performance-tuning your Hadoop pipelines (Aaron Beppu)

From Batch to Realtime with Hadoop (Lars George)

Event-Stream Processing with Kafka (Tim Lossen)

Real-/Neartime analysis with Hadoop & VoltDB (Ralf Neeb)

Takeaways

· use hadoop only if you must
· really understand the pipeline
· unbox the black box

@tcurdt
github.com/tcurdt

yourdailygeekery.com

That’s it, folks!