NDN-Hadoop: Exploring Applicability of NDN for Big...

1
NDN-Hadoop: Exploring Applicability of NDN for Big-Data Computing Chris Gniady and Beichuan Zhang Department of Computer Science, University of Arizona [email protected], [email protected] Datacenter Networks Are Challenging Densely connected Lots of engineering to maximize performance before hitting limits Alternative protocols may simplify system and applications Evolution of Communication Abstraction Developed for the Internet Efficient multicast In-network caching Failure recovery Telephony: name the path IP: name the endpoint Name Data Network (NDN): name the data NDN Benefits for Big Data NDN provides network optimizations Simplifies development of Big Data applications NDN provides data multicast Benefits data replication in Big Data storage NDN provides in network caching Improves performance in Big Data computing NDN retrieves data from alternate sources Transparent failure recovery in Big Data systems 800 lines of code eliminated Several corner cases Minimal changes in current code for comparison Further code simplification in optimized applications Exploring NDN in a Datacenter Simpler Code Goal: Understand applicability and benefits Apache Hadoop as a Big Data system Small Scale Test System Multicasting in Datacenter Standard IP: Pipelined No overlap in traffic High bandwidth NDN: Multicast Lower bandwidth More Efficient Replication In Network Caching

Transcript of NDN-Hadoop: Exploring Applicability of NDN for Big...

Page 1: NDN-Hadoop: Exploring Applicability of NDN for Big …esaule/NSF-PI-CSR-2017-website/posters/...NDN-Hadoop: Exploring Applicability of NDN for Big-Data Computing. ... • Apache Hadoop

NDN-Hadoop: Exploring Applicability of NDN for Big-Data ComputingChris Gniady and Beichuan Zhang

Department of Computer Science, University of [email protected], [email protected]

Datacenter Networks Are Challenging

• Densely connected• Lots of engineering to maximize performance before hitting limits

• Alternative protocols may simplify system and applications

Evolution of Communication Abstraction

• Developed for the Internet• Efficient multicast • In-network caching• Failure recovery

Telephony: name the path IP: name the endpoint

Name Data Network (NDN): name the data

NDN Benefits for Big Data

• NDN provides network optimizations• Simplifies development of Big Data applications

• NDN provides data multicast • Benefits data replication in Big Data storage

• NDN provides in network caching • Improves performance in Big Data computing

• NDN retrieves data from alternate sources• Transparent failure recovery in Big Data systems

• 800 lines of code eliminated• Several corner cases• Minimal changes in current code for comparison• Further code simplification in optimized applications

Exploring NDN in a Datacenter Simpler Code

• Goal: Understand applicability and benefits• Apache Hadoop as a Big Data system

Small Scale Test System

Multicasting in Datacenter

Standard IP:• Pipelined • No overlap in traffic• High bandwidth

NDN:• Multicast• Lower bandwidth

More Efficient Replication

In Network Caching