High Performance Computing in Our Everydays
Peter Wittek
Swedish School of Library and Information Science, University of Borås
10/10/11
Outline
1 What Is New in HPC?
2 Supporting Frameworks
3 Computational Requirements of Digital Libraries
4 A Workflow in Cloud HPC
5 Experimental Results
6 Open Issues
7 Conclusions
What Is New in HPC?
Cloud HPC
Cloud computing: think of it as a utility
E.g., you get to use 10 small computer instances for $0.82 an hour
Your computer instances do not necessarily correspond to actual computers
Virtualization
Demo: ReactOS
Latest contestant in cloud computing: HPC
Not ordinary computer instances
What Is New in HPC?
Massive Parallelism
Figure: Floating-Point Operations per Second for the CPU and GPU
What Is New in HPC?
Massive Parallelism
Figure: CPU versus GPU architecture. The CPU spends die area on a large cache and control logic with a few ALUs; the GPU packs many ALUs with minimal control and cache. Both are backed by DRAM.
Streaming hardware
Explicit memory management
What Is New in HPC?
Massive Parallelism
Parallel versus distributed computing
Distributed nodes do not share memory:
Connected through a network;
Calculations may run in a parallel fashion;
Other nodes do not see what one node has computed;
Nodes may fail.
What Is New in HPC?
Why You Should Care
Digital libraries and HPC?
No need for upfront investment;
Go beyond full-text search;
Machine learning;
Pattern matching;
Social media and graph mining.
You can define a new field
Freedom
Supporting Frameworks
Why Is Distributed Computing Hard?
Take an example: creating an inverted index
An inverted index is at the core of search engines
A simple example:
term1: (doc1, freq11), (doc5, freq51)
term2: (doc1, freq12), (doc3, freq32), (doc6, freq62)
A naïve approach to parallelize:
Have an indexer at each node;
Distribute documents to nodes;
Let nodes broadcast the lists (Message Passing Interface, MPI).
Supporting Frameworks
MapReduce
Published in 2004 by Google researchers
Since then it has become widespread in data-intensive processing
Core idea: keep things simple; you can do two things:
Map: send out chunks of data and then do something on them
Reduce: collect chunks of data and do something on them while collecting
Intermediate data structure: key-value pairs
The framework should also take care of the mundane tasks, such as failing nodes, network latency, etc.
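The map/reduce idea above can be sketched in plain Python with the canonical word-count example. This is a single-process illustration, not any real framework: the function names and the driver loop are stand-ins for what a framework would run across nodes.

```python
from collections import defaultdict
from itertools import chain

def map_fn(chunk):
    # Map: emit a (key, value) pair for each word in this chunk of text
    return [(word, 1) for word in chunk.split()]

def reduce_fn(key, values):
    # Reduce: merge everything collected under one key
    return key, sum(values)

def run_mapreduce(chunks):
    # Shuffle: group intermediate key-value pairs by key, then reduce each group
    groups = defaultdict(list)
    for key, value in chain.from_iterable(map_fn(c) for c in chunks):
        groups[key].append(value)
    return dict(reduce_fn(k, v) for k, v in groups.items())

counts = run_mapreduce(["to be or", "not to be"])
# counts == {"to": 2, "be": 2, "or": 1, "not": 1}
```

A real framework adds exactly the mundane parts elided here: distributing the chunks, moving the intermediate pairs over the network, and restarting failed tasks.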
Supporting Frameworks
A MapReduce Inverted Indexer
The task is: formulate your problem in MapReduce terms
Map: gets a chunk of text. Emits:
Key: term
Value: document id and corresponding frequency
Reduce: merges by key
There might be a different number of map and reduce tasks
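The formulation above can be sketched in framework-free Python; `map_fn`, `reduce_fn`, and the driver are illustrative stand-ins for what a MapReduce framework would execute on each node.

```python
from collections import Counter, defaultdict

def map_fn(doc_id, text):
    # Map: for each distinct term in this document,
    # emit key = term, value = (doc_id, frequency)
    for term, freq in Counter(text.split()).items():
        yield term, (doc_id, freq)

def reduce_fn(term, postings):
    # Reduce: merge all postings emitted for one term into a sorted list
    return term, sorted(postings)

def index_documents(docs):
    # Driver: group intermediate pairs by key, then reduce each group
    groups = defaultdict(list)
    for doc_id, text in docs.items():
        for term, posting in map_fn(doc_id, text):
            groups[term].append(posting)
    return dict(reduce_fn(t, p) for t, p in groups.items())

index = index_documents({"doc1": "term1 term2", "doc5": "term1"})
# index["term1"] == [("doc1", 1), ("doc5", 1)]
```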
Supporting Frameworks
Another MapReduce Example
Sometimes it is worth bypassing the reduce phase
Then we do not need to emit key-value pairs at all
Distributed GPU random projection
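A map-only random projection might look like the sketch below, using NumPy on the CPU as a stand-in for a GPU kernel; the matrix shapes and the shared seed are illustrative assumptions. Each map task projects its own chunk of rows with the same random matrix and writes the result out directly, so there is no reduce phase and no key-value pairs.

```python
import numpy as np

def make_projection(dim_in, dim_out, seed=42):
    # Every node derives an identical random matrix from a shared seed,
    # so no node needs to see another node's data
    rng = np.random.default_rng(seed)
    return rng.standard_normal((dim_in, dim_out)) / np.sqrt(dim_out)

def map_only_task(chunk, projection):
    # Map: project this node's chunk of rows into the lower-dimensional
    # space; each task writes its output independently, nothing to merge
    return chunk @ projection

data = np.ones((8, 100))           # one node's chunk: 8 docs, 100 terms
proj = make_projection(100, 10)    # shared 100 -> 10 projection
reduced = map_only_task(data, proj)
assert reduced.shape == (8, 10)
```

On a GPU the matrix multiply would be handed to streaming hardware (e.g. a BLAS routine), which is exactly the kind of dense arithmetic GPUs excel at.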
Supporting Frameworks
Exploiting GPU Resources
Low-level frameworks: CUDA and OpenCL
They certainly do not make GPUs much friendlier
Higher-level libraries: BLAS, cuSPARSE
As long as you know maths...
Supporting Frameworks
Overcoming GPU Obstacles
GPU MapReduce
Academic projects: Mars, GPMR
GPU-aware MapReduce: extend existing frameworks
Develop extensive middleware
Computational Requirements of Digital Libraries
Digital Preservation
Future-proofing document collections
Emulation
Migration
Workflows are often tremendously compute-intensive
Computational Requirements of Digital Libraries
Machine Learning and Advanced Services
Digital collections and social networks
A step towards digital curation
SaaS approach to digital curation
Indexing by Lucene/Nutch
Collection-level metadata extraction by Mahout
A Workflow in Cloud HPC
A Middleware Architecture
Figure: Middleware architecture. Support services (document processes, context search, data mining) sit on top of a MapReduce engine, policy enforcement, and an archival storage interface; the middleware runs on grid or cloud storage and grid or cloud computing.
A middleware to make adoption by DL practitioners easier
Moving towards computational science
Experimental Results
Cost
Figure: Comparison of average cost of computations with different collection sizes (100, 1,000, and 10,000 documents); average cost in USD versus number of processing cores (1 to 80).
Experimental Results
Running time
Figure: Comparison of running times with different collection sizes (100, 1,000, and 10,000 documents); running time in minutes versus number of processing cores (1 to 80).
Open Issues
Obstacles to Adoption
Persistence and high reliability
MapReduce
Not just a technological issue
Service-level agreement
Particularly problematic
Another EU FP7 project working on it: SLA@SOI
Niche for alternative cloud providers
Difficulty of integration
Conclusions
Acknowledgment
Work has been funded by Sustaining Heritage Access through Multivalent ArchiviNg (SHAMAN), an EU FP7 large integrated project.
http://shaman-ip.eu/shaman/
Additional funding has been received from Amazon Web Services.
http://aws.amazon.com/
Conclusions
Summary
Cloud and HPC: a solution looking for a problem
Digital libraries
Computational requirements
Expertise
Complexity and integration
Contact: [email protected]