Building the world with Elastic Map Reduce

apps on maps...


Oliver Norton, Technical Director

Tim Jenks, Technical Lead

appsonmaps.com

AS3 SDK – Apps in Browser

JS API – Embed on your website

iOS SDK – Mobile, coming Q4 2012

social media

social commerce

traffic updates

journey planning

worldflightclub.com

flying like a bird

World Flight Club on YouTube

http://youtu.be/5bJPNjiYy40


http://www.worldflightclub.com/

flipping the bird!

Flipper on YouTube

http://www.youtube.com/watch?v=E0lZee6cGxg&feature=youtu.be



photographic maps

Photograph-based maps...

layered data

We fuse layered data to procedurally generate our maps (using AWS’ Elastic Map Reduce)

streamed

All built & served from off-the-shelf Amazon Web Service infrastructure

pipeline

data size

Over 2TB Data

Terrain: GB (10 m, ¼ million km²), US (10 m, 40x GB)

Buildings: GB (full coverage), US (120 cities)

Roads: GB (¼ million miles), US (4 million miles)
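For a rough sense of scale, the terrain figures can be turned into sample counts. This is only a back-of-envelope sketch: it assumes "10 m" means one height sample per 10 m x 10 m grid cell and that "40x GB" means forty times the GB coverage area. Neither reading is confirmed by the slides, and the function name is made up for illustration.

```python
# Back-of-envelope terrain sample counts.
# Assumptions (not stated in the talk): one sample per 10 m x 10 m cell,
# and "40x GB" read as forty times the GB area.
GRID_SPACING_M = 10
GB_AREA_KM2 = 250_000       # "1/4 million km^2"
US_SCALE_FACTOR = 40        # "40x GB"


def terrain_samples(area_km2, spacing_m):
    """Number of grid samples needed to cover area_km2 at the given spacing."""
    samples_per_km = 1000 // spacing_m      # samples along one kilometre
    return area_km2 * samples_per_km ** 2   # samples per km^2 times area


gb_samples = terrain_samples(GB_AREA_KM2, GRID_SPACING_M)
us_samples = gb_samples * US_SCALE_FACTOR
print(f"GB: {gb_samples:.2e} samples, US: {us_samples:.2e} samples")
```

Under those assumptions the terrain layer alone runs to billions of samples for GB, before buildings and roads are fused in.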

Processing this much data quickly gets expensive $$$

before

• Limited scalability -> 60 desktop-spec machines

• Multi-TB SAN with a £10k/year maintenance cost

• In-house build system that needed maintaining

• 10 Mbit/s symmetric internet connection to upload TBs of data

• Only 3 developers knew how to run builds

• Electricity costs -> who knows…
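The upload bottleneck can be put in numbers. The sketch below is a rough estimate only: it takes "over 2TB" as exactly 2 decimal terabytes and assumes the 10 Mbit/s link is fully saturated with no protocol overhead, so real transfers would take longer.

```python
# Rough upload-time estimate for the old setup.
# Assumptions: 2 TB = 2 * 10**12 bytes, link fully utilised, no overhead.
DATA_BYTES = 2 * 10**12
LINK_BITS_PER_SEC = 10 * 10**6   # 10 Mbit/s

SECONDS_PER_DAY = 86_400


def upload_days(data_bytes, bits_per_sec):
    """Days needed to push data_bytes through a link of bits_per_sec."""
    seconds = data_bytes * 8 / bits_per_sec
    return seconds / SECONDS_PER_DAY


days = upload_days(DATA_BYTES, LINK_BITS_PER_SEC)
print(f"{days:.1f} days")
```

Even in this best case a single full upload ties up the link for more than two weeks.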

now

On Amazon Elastic Map Reduce

now

• Scalability -> 800 m1.large instances

• Off-the-shelf tech that’s discoverable (Hadoop, mrjob)

• Maintenance reduced

• Data is already in cloud (source, and destination)

• More predictable costs (and happier costs, with spot pricing)

• DevOps benefits: Now any engineer can write and run jobs, not just 3

pipeline

AWS S3

AWS EMR

AWS S3

AWS CloudFront

AWS EC2

mrjob

• mrjob from Yelp

• http://github.com/Yelp/mrjob




800 machines in 20 lines

    from mrjob.job import MRJob

    class MyMapReduceJob(MRJob):
        def mapper_init(self):
            self.__mapper = None  # wire up mapper

        def mapper(self, key, line):
            # perform map work
            for key, value in self.__mapper.map(line, None):
                yield str(key), value

        def reducer_init(self):
            self.__reducer = None  # wire up reducer

        def reducer(self, key, values):
            # perform reduce work
            result = self.__reducer.reduce(key, values)
            if result:
                yield key, "built successfully"
            else:
                yield key, "failed"

    if __name__ == '__main__':
        MyMapReduceJob.run()
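The mapper/reducer shape on this slide can be tried locally without Hadoop or mrjob. The sketch below is a plain-Python stand-in for the map, shuffle, and reduce phases, not the talk's actual build code: the input format ("tile id, then layer name"), the tile ids, and the rule that a tile builds only when a terrain layer is present are all hypothetical.

```python
from collections import defaultdict


def mapper(line):
    # Hypothetical input format: "<tile_id> <layer_name>"
    tile_id, layer = line.split()
    yield tile_id, layer


def reducer(tile_id, layers):
    # Hypothetical rule: a tile builds only if it has a terrain layer.
    if "terrain" in set(layers):
        yield tile_id, "built successfully"
    else:
        yield tile_id, "failed"


def run_local(lines):
    """Run map, then group values by key (the shuffle), then reduce."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)

    results = {}
    for key in sorted(grouped):
        for out_key, out_value in reducer(key, grouped[key]):
            results[out_key] = out_value
    return results


results = run_local(["t1 terrain", "t1 roads", "t2 buildings"])
print(results)
```

The useful property is that each key (here, a tile) is reduced independently, which is what lets the same job spread across hundreds of machines.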

amazon emr

Processing complexity × data size

thanks

appsonmaps.com

Twitter: @eeGeo
