BIC-LSU (Big Data Research Integration with Cyberinfrastructure for LSU)


Page 1

NSF CC-NIE Integration: Bridging, Transferring and Analyzing Big Data over 10Gbps Campus-Wide Software Defined Networks

BIC-LSU
(Big Data Research Integration with Cyberinfrastructure for LSU)

Seung-Jong (Jay) Park
Associate Professor, Computer Science
Center for Computation & Technology
Louisiana State University

Page 2

Big Data Research at LSU

• Biology & Veterinary
  – Genome Sequencing
• Chemistry
  – Experiment & Simulation
• Computer Science
  – Data Mining & Visualization
• Coastal Science
  – Hazard Simulation & Modeling
• Physics & Astronomy
  – LIGO

Big Data requires a fast supercomputer, large storage, and a high-speed network.

Page 3

Challenges @ LSU

(Diagram: research labs connected to the HPC clusters)

• How to Store
  – Each research lab is at a remote location
  – Labs have slow storage: HDD speed < 1Gbps
• How to Transfer
  – Network between a lab and HPC: bandwidth < 1Gbps
• How to Process
  – Message Passing Interface (MPI): hard to program

Page 4

3 Objectives @ LSU

(Diagram: research labs connected to the HPC clusters)

1. How to Store: Develop 8 SSD storage servers = 12TB & 20Gbps I/O bandwidth
2. How to Transfer: Network between labs and HPC: bandwidth = 20Gbps
3. How to Process: Develop a virtual Hadoop cluster
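To make the transfer objective concrete, the short sketch below compares ideal transfer times on the old and target links. It is only back-of-the-envelope arithmetic: the function name `transfer_time_seconds` is made up for illustration, the 470 GB payload is the raw human genome data set cited in the genome sequencing case study, decimal gigabytes are assumed, and protocol overhead and storage speed are ignored. Since the lab links were below 1Gbps, the 1Gbps figure is a best case for the old network.

```python
def transfer_time_seconds(size_gb: float, link_gbps: float) -> float:
    """Ideal time to move size_gb gigabytes over a link_gbps link (no overhead)."""
    size_gigabits = size_gb * 8  # 1 byte = 8 bits; decimal GB assumed
    return size_gigabits / link_gbps


DATASET_GB = 470  # raw genome data size quoted in the case study slide

# 1.0 Gbps is an upper bound for the old lab links; 20.0 Gbps is the stated target.
for label, gbps in [("old link, best case (1 Gbps)", 1.0), ("target (20 Gbps)", 20.0)]:
    seconds = transfer_time_seconds(DATASET_GB, gbps)
    print(f"{label}: {seconds / 60:.1f} minutes")
```

Under these assumptions the data set needs at least about an hour on the old links and roughly three minutes at 20Gbps, provided the storage servers can sustain the same I/O rate.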

Page 5

LSU Cyberinfrastructure for Big Data

(Network diagram; the labels it contains are listed below)

• Storage servers: @Vet School, @Chemistry, @CCT, @Biology, @Coastal, @EECS
• OpenFlow (OF) switches:
  – Edge OF Switch Pronto 3290
  – Aggregation OF Switch Pronto 3780
  – Pluribus Core OF Switch @D Boyd
  – Pluribus Core OF Switch @Frey
• Compute: Hadoop On Demand SuperMike II @Frey; Hadoop Cluster @Frey
• Data source: Gene Sequencer
• External connectivity: 100Gbps Router @Frey; Cisco AS9000; LONI; Internet2 10Gbps
• Link speeds shown: 10Gbps, 2 x 10Gbps, 40Gbps

Collaboration with Samsung for SSD storage servers

Page 6

Case Study: Genome Sequence Analysis

• Human Genome Sequencing
  – An NIH standard human genome sequence data set has 470 GB of raw data and requires more than a terabyte of memory for assembly
• Hadoop/Giraph-based software framework
  – Assembling billions of short reads into one sequence of 3 billion base pairs
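As a rough illustration of why a Hadoop-style framework suits short-read data, the sketch below counts k-mers (length-k substrings) with explicit map, shuffle, and reduce steps. This is an assumption for illustration only: the slides do not say that k-mer counting is part of the BIC-LSU pipeline, the function names are made up, the reads are toy data, and everything runs in a single Python process rather than on Hadoop.

```python
from collections import defaultdict
from itertools import chain


def map_read(read, k):
    """Map step: emit a (k-mer, 1) pair for every k-length window of one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group emitted values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the values collected for each k-mer."""
    return {kmer: sum(values) for kmer, values in groups.items()}


def count_kmers(reads, k):
    """Run map, shuffle, and reduce in sequence over all reads."""
    pairs = chain.from_iterable(map_read(read, k) for read in reads)
    return reduce_counts(shuffle(pairs))


reads = ["ACGTAC", "CGTACG"]  # made-up toy reads, not real sequencing data
counts = count_kmers(reads, 3)
print(counts)
```

Each read is processed independently in the map step, which is what lets a cluster spread billions of reads across many machines without the explicit message passing that MPI requires.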

Page 7

Case Study: De novo Assembly

• Developing a Giraph/Hadoop-based de novo assembler
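The slide does not describe how the assembler works internally. As a hedged sketch of the general idea behind de novo assembly, the toy example below uses a de Bruijn graph, a technique common in short-read assemblers; whether the BIC-LSU assembler uses this graph model is an assumption. The reads are made up and error-free, the function names are illustrative, and the code runs on one machine, whereas a Giraph-based assembler would distribute the graph across a cluster.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, one edge per distinct k-mer."""
    kmers = {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}
    graph = defaultdict(list)
    for kmer in sorted(kmers):  # sorted only to make the result deterministic
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm; assumes the graph actually has an Eulerian path."""
    out_deg = {node: len(targets) for node, targets in graph.items()}
    in_deg = defaultdict(int)
    for targets in graph.values():
        for target in targets:
            in_deg[target] += 1
    # Start where out-degree exceeds in-degree by one, else at any node with edges.
    start = next((n for n in out_deg if out_deg[n] - in_deg[n] == 1), next(iter(graph)))
    remaining = {node: list(targets) for node, targets in graph.items()}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            stack.append(remaining[node].pop())  # follow an unused edge
        else:
            path.append(stack.pop())  # dead end: node is finished
    return path[::-1]


def assemble(reads, k):
    """Spell the contig by walking the Eulerian path through the graph."""
    path = eulerian_path(build_de_bruijn(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


reads = ["ATGGCG", "GGCGTG", "CGTGCA"]  # overlapping toy reads
contig = assemble(reads, 4)
print(contig)
```

Real assemblers also have to cope with sequencing errors, repeated regions, and reverse-complement strands, none of which this sketch handles.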

Page 8

BIC-LSU: Milestones

• 1st year:
  – 2013 Sept: Project start
  – 2013 Dec: Constructed fibers at 2 sites (CCT, CS)
  – 2013 Mar: SSD storage servers by Samsung
  – 2013 Apr: Testing OpenFlow switches (PICA8, HP, Pluribus)
  – 2013 May: Shipping SSD servers from Samsung
  – 2013 July: Finish fibers at 4 sites (Bio, Vet, Chem, Coastal)
• 2nd year:
  – 2013 Aug: Deploy OF switches
  – 2013 Dec: Develop a POX-based OF controller
  – 2014 Feb: Develop web-based gateway
  – 2014 May: Demonstrate genome assembly over BIC-LSU