Grid Computing 7700 - Center for Computation and Technologygallen/Teaching/Fall2005_7700... ·...
Grid Computing 7700, Fall 2005
Lecture 4: Scientific Computing and Hardware
Gabrielle Allen
http://www.cct.lsu.edu/~gallen
Basic Elements
[Diagram: two sites, each with a machine of four CPUs and a disk on a machine network, attached to a campus network (LAN); the two campus networks are joined by a wide area network]
Basic Elements
Distributed systems built from
– Computing elements (processors)
– Communication elements (networks)
– Storage elements (disk, attached or networked)
New elements
– Visualization/interactive devices
– Experimental and operational devices
Distributed Resources
Local workstations
CCT Resources
Campus/OCS Resources
State/LONI Resources
National Centers
International Colleagues
Laws
Moore's Law
– Number of transistors on an integrated circuit doubles every 18 months
– http://en.wikipedia.org/wiki/Moores_law
“Kryder's Law”
– Hard disk capacity grows faster than transistor counts
– http://www.sciam.com/article.cfm?chanID=sa006&colID=30&articleID=000B0C22-0805-12D8-BDFD83414B7F0000
Gilder's Law
– Total bandwidth of communication systems doubles every six months
Metcalfe's Law
– Value of a network is proportional to the square of the number of nodes
Amdahl's Law
– Law of diminishing returns: maximum speedup is restricted by the slowest (serial) parts
– http://en.wikipedia.org/wiki/Amdahls_law
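Amdahl's Law can be made concrete with a few lines of Python. This is a minimal sketch of the standard formula, speedup = 1 / (s + p/N), where p is the fraction of the work that can run in parallel, s = 1 - p is the serial fraction, and N is the number of processors. The function name and the 95% figure are illustrative assumptions, not taken from the lecture.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Upper bound on speedup when only part of a program runs in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time regardless of processor count;
    # only the parallel part is divided among the processors.
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Hypothetical code that is 95% parallel: adding processors helps less and less.
    for n in (1, 16, 256, 4096):
        print(n, round(amdahl_speedup(0.95, n), 2))
```

With a 5% serial part the speedup can never exceed 1/0.05 = 20, however many processors are added, which is the "diminishing returns" in the slide.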
Question: So what about applications?
Compute Elements
Moore's Law: #transistors on a chip (and clock speed) increase exponentially (double every 18 months)
– Transistors = 20*2^[(year-1965)/1.5]
– 1975 Intel 8080 has 4500 transistors, 100K instructions/sec
– 2003 Pentium IV has 221,000,000 transistors, 8 billion instructions/sec
Corollary: Price of a given level of supercomputing power halves every 18 months
Price decrease means that supercomputers are now usually built from “commodity” processors
– IA32, PowerPC, “emotion engine”
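The transistor formula on this slide is easy to evaluate directly. The sketch below simply codes Transistors = 20*2^[(year-1965)/1.5] and prints it next to the two chip counts quoted above; the function name is an assumption made for illustration.

```python
def moore_transistors(year: float) -> float:
    """Transistor count predicted by the slide's formula (doubling every 1.5 years)."""
    return 20 * 2 ** ((year - 1965) / 1.5)


if __name__ == "__main__":
    # Chip counts as quoted on the slide, for comparison with the prediction.
    quoted = {1975: 4_500, 2003: 221_000_000}
    for year, actual in quoted.items():
        predicted = moore_transistors(year)
        print(f"{year}: predicted {predicted:,.0f}, quoted {actual:,}")
```

The formula gives roughly 2,000 transistors for 1975 and roughly 8 x 10^8 for 2003, so it matches the quoted chips only to within a small factor; it is best read as a trend line rather than an exact fit.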
Compute Elements
Clock speed
Cache hierarchy
Floating point registers
Main memory
Internal bandwidths
Etc, etc
Need powerful operating systems, compilers, applications to leverage all this
Communication Elements
Links, routers, switches, name servers, protocols
Infrastructure evolves slowly (politics, large scale changes, money)
Gilder's Law: total bandwidth of communication systems doubles every six months
Change in LAN to desktops
– 100 Mbps shared
– 100 Mbps switched
– 1 Gbps
– 10 Gbps
Clusters: GigE (TCP/IP and MPICH/LAM) is standard; Myricom/Quadrics (own MPI drivers) give better performance; InfiniBand/Fibre Channel have a different architecture
Network Speeds
Analog modem: 57 kbps
GPRS: 114 kbps
Bluetooth: 723 kbps
T-1: 1.5 Mbps
Eth 10Base-X: 10 Mbps
802.11b (WiFi): 11 Mbps
T-3: 45 Mbps
OC-1: 52 Mbps
Fast Eth 100Base-X: 100 Mbps
OC-12: 622 Mbps
GigEth 1000Base-X: 1 Gbps
OC-24: 1.2 Gbps
OC-48: 2.5 Gbps
OC-192: 10 Gbps
10 GigEth: 10 Gbps
OC-3072: 160 Gbps
My Cox Cable
– Upload: 35 KB/s
– Download: 250 KB/s
CCT “is” to supermike
– Up/down: 5000 KB/s
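To get a feel for these numbers, the following sketch estimates how long a file would take to cross a few of the links listed above. It assumes the quoted rates are peak rates in bits per second and ignores latency and protocol overhead, so real transfers would be slower. The 1 GB file size and the names used are illustrative assumptions.

```python
# Peak rates in bits per second, taken from the list above.
LINK_BITS_PER_SECOND = {
    "Analog modem": 57e3,
    "T-1": 1.5e6,
    "Fast Eth 100Base-X": 100e6,
    "GigEth 1000Base-X": 1e9,
    "OC-192": 10e9,
}


def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal transfer time: 8 bits per byte, no overhead, no latency."""
    return size_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    one_gigabyte = 1e9  # hypothetical data set size
    for name, rate in LINK_BITS_PER_SECOND.items():
        print(f"{name:>20}: {transfer_seconds(one_gigabyte, rate):12.1f} s")
```

Note the unit difference in the slide: link speeds are given in bits per second (kbps, Mbps, Gbps), while the measured cable and CCT rates are in kilobytes per second (KB/s).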
Communication Elements
Interconnect Type | Short Message Latency (microsec) | Peak Bandwidth (mbps) | Bidirectional Bandwidth (mbps) | Approximate cost per port
Gigabit Ethernet | 100 | ~65 | ~130 | $100
Myrinet | 9 | 280 | 500 | $1000
Quadrics | 5 | 300 | 500 | $3000
Storage Elements
Magnetic tape/Magnetic disk
Magnetic disk
– Properties: density/rotation/cost
– 1970-1988 density improvements 29% per year
– 1988-now density improvements 60% per year
– Standard in PCs: 500 MB (1995), 2 GB (1997), 100 GB (2002)
– Performance not increasing as fast
• Peak transfer (~100 mb/s)
• Seek times (3-5 ms) [bottleneck]
Grids: cost of storage negligible, high speed networks make large data libraries attractive
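The two density growth rates above are easier to compare as doubling times. This short sketch converts an annual growth rate into a doubling time using ln 2 / ln(1 + rate); the function names are illustrative assumptions.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years needed to double at a fixed compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


def growth_factor(annual_growth: float, years: float) -> float:
    """Total multiplication after `years` of compound growth."""
    return (1 + annual_growth) ** years


if __name__ == "__main__":
    for label, rate in (("1970-1988", 0.29), ("1988-now", 0.60)):
        print(f"{label}: {rate:.0%} per year doubles every "
              f"{doubling_time_years(rate):.2f} years")
```

At 29% per year density doubles roughly every 2.7 years; at 60% per year it doubles in just under 1.5 years, which is comparable to the 18-month doubling quoted for transistors.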
The Future (??)
Machine | Compute | Memory | Disk | Network
2003 PC | 8 g-op/s | 512 mb | 128 gb | 1 gb/s
2003 SC | 80 t-op/s | 50 tb | 1280 pb | 10 tb/s
2008 PC | 64 g-op/s | 16 gb | 2 tb | 10 gb/s
2008 SC | 640 t-op/s | 160 tb | 20 pb | 100 tb/s
2013 PC | 512 g-op/s | 256 gb | 32 tb | 100 gb/s
2013 SC | 5 p-op/s | 2.6 pb | 320 pb | 1 pb/s
1 mega = 10^6, 1 giga = 10^9, 1 tera = 10^12, 1 peta = 10^15
TeraGrid: 40 TFlop/s, 6 TB memory, 1 Petabyte storage, 10 Gigabits/s
Earth Simulator: 40 TFlop/s, 10 TB memory, 2.5 Petabytes storage, 13 Gigabits/s
DOE BlueGene: 367 TFlop/s, 16 TB memory, 400 Terabytes storage
Supercomputers
Definition of supercomputer
– Machine on top500.org ?
• http://www.top500.org/lists/plists.php?Y=2005&M=06
– Machine costing over $1M ?
– Basically highest end machines
Top 3 (2005)
– DOE BlueGene/L (USA) 66K procs/137 TF
– IBM BGW (USA) 41K procs/91 TF
– NASA Columbia (USA) 10K procs/52 TF
Top 3 (2003)
– Earth Simulator (JAPAN) 5K procs/36 TF (6)
– ASCI Q (USA) 8K procs/14 TF (12)
– G5 Cluster (USA) 2K procs/12 TF (14)
Others
– 18 IBM (China)
– 147 Supermike (LSU !!!)
www.webopedia.com
The fastest type of computer. Supercomputers are very expensive and are employed for specialized applications that require immense amounts of mathematical calculations. For example, weather forecasting requires a supercomputer. Other uses of supercomputers include animated graphics, fluid dynamic calculations, nuclear energy research, and petroleum exploration. The chief difference between a supercomputer and a mainframe is that a supercomputer channels all its power into executing a few programs as fast as possible, whereas a mainframe uses its power to execute many programs concurrently.
Architectural Classes
Flynn (1972): classification based on the way a system manipulates instruction and data streams:
SISD Single Instruction Single Data
– One instruction stream executed serially
– Conventional workstations
SIMD Single Instruction Multiple Data
– Large number (many thousands) of processing units
– All execute same instruction on different data in lockstep
– Vector processors (NEC SX-6i) acting on arrays of data
MISD Multiple Instruction Single Data
– No machines built
MIMD Multiple Instruction Multiple Data
– Differs from a collection of SISD machines because the instruction and data streams are related
More Classification
Shared Memory Systems
– Multiple CPUs sharing same address space
– One memory accessed by all processors equally
– Location of data not important to user
– Can be SIMD (single processor vector processor) or MIMD
– OpenMP http://www.openmp.org/index.cgi?faq
Distributed Memory Systems
– Each CPU has own memory
– CPUs are connected by network
– Location of data important
– Can be SIMD (lock step example before) or MIMD (large variety of network topologies)
– Distributed processing takes DM-MIMD to extreme
Message Passing
Essential for DM machines, but often also used for SM machines for compatibility
– MPI Message Passing Interface
– PVM Parallel Virtual Machine
DM-MIMD
Fast growing section, best performance. Need to balance computation and communication performance in machine design (and upgrades)
User has to distribute data between processors
User has to perform data exchange between processors explicitly
Slow compared to SM machines to access data on other processors
Programming models/algorithms important
Programming environments can make this easier (e.g. Cactus Framework http://www.cactuscode.org handles data distribution, communications, IO, …)
Same programming models need to be extended to Grid computing
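The two user responsibilities listed above (distribute the data, then exchange results explicitly) can be sketched in a few lines. This is only a simulation under stated assumptions: Python threads and a queue stand in for separate processors and a network, whereas a real DM machine would use MPI or PVM. All names here are hypothetical.

```python
import queue
import threading


def worker(rank, local_data, outbox):
    """One 'processor': it sees only its own chunk and reports back by message."""
    partial = sum(local_data)
    outbox.put((rank, partial))  # explicit send to the coordinator


def distributed_sum(data, nworkers):
    """Sum `data` by distributing chunks and gathering partial sums."""
    # Step 1: the user distributes the data (round-robin split here).
    chunks = [data[r::nworkers] for r in range(nworkers)]
    inbox = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(r, chunks[r], inbox))
        for r in range(nworkers)
    ]
    for t in threads:
        t.start()
    # Step 2: the user gathers results explicitly, one message per worker.
    total = 0
    for _ in range(nworkers):
        _rank, partial = inbox.get()
        total += partial
    for t in threads:
        t.join()
    return total


if __name__ == "__main__":
    print(distributed_sum(list(range(1, 101)), nworkers=4))
```

The point of the sketch is the structure, not the speed: no worker ever touches another worker's chunk, and every piece of shared information travels as a message.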
ccNUMA
Cache Coherent Non Uniform Memory Access
Build systems from SMPs (symmetric multiprocessing nodes)
SMPs consist of up to ~16 processors, connected by a crossbar, which share the same memory
Each node is a SM-MIMD, but with different memory access times for different processors (memory is physically distributed)
Nodes are then connected to each other in a different way
Computational scientists like these machines
DM-MIMD
Processor topology and interconnects very important
– Hypercube (with 2^d nodes, the number of steps between two nodes is at most d; possible to simulate other topologies)
– Fat tree (simple tree structure with more connections at higher levels to ease congestion)
– 2D/3D mesh structure (many apps map well to this, avoids expense)
– Crossbars (connecting up to around 64 processors, can be hierarchical)
Details should be hidden from application programmers, but for performance they need to be aware of them
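The hypercube property quoted above (at most d steps between any two of the 2^d nodes) follows from labelling nodes with d-bit numbers and linking nodes whose labels differ in exactly one bit. A minimal sketch under that standard labelling, with illustrative function names:

```python
def hypercube_neighbors(node: int, d: int) -> list:
    """Nodes directly linked to `node` in a d-dimensional hypercube."""
    # Flipping each of the d bits in turn gives the d neighbours.
    return [node ^ (1 << k) for k in range(d)]


def hypercube_hops(a: int, b: int) -> int:
    """Minimum number of steps from node a to node b."""
    # Each step fixes one differing bit, so the distance is the
    # number of bit positions in which the two labels differ.
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    d = 4
    nodes = range(2 ** d)
    longest = max(hypercube_hops(a, b) for a in nodes for b in nodes)
    print(f"{2 ** d} nodes, longest route = {longest} steps")
```

Because two d-bit labels can differ in at most d positions, the longest route in the network is exactly d steps.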
Virtual Shared Memory
Kendall Square Research Systems tried to implement at hardware level
High Performance Fortran
– HPF Specification 1993
– Simulates a virtual shared memory at a software level
– Programming directives distribute data across processors
– Looks like shared memory machine to user
Some vendors have proprietary virtual shared memory programming models that provide a global address space
Network Eras
Past (1969-1988)
– ARPANET/NSFNET
Current (1988-2005)
Future (2005-)
Historical network maps
– http://www.cybergeography.org/atlas/historical.html
Network Infrastructure
Chapter 30 (The Grid 2)
Network infrastructure is the foundation on which Grids are built
Composition of local and wide area services, transport protocols and services, routing protocols and network services, link protocols and physical media
One example of network infrastructure is the Internet (core protocols TCP/IP)
Protocol
Agreed-upon format for transmitting data between two devices which determines:
– The type of error checking to be used
– Any data compression method
– How sending device indicates it has finished sending a message
– How receiving device indicates it has received a message
Various standard protocols: differ in simplicity, reliability, performance.
Computer/device must support the right ones to communicate with other computers.
Implemented either in hardware or in software
http://www.protocols.com/protocols.htm
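Two of the decisions listed above, error checking and marking the end of a message, can be illustrated with a toy framing format. This is a hypothetical sketch, not any real protocol: each message is sent as a length, a CRC32 checksum, and then the payload. Compression and acknowledgements are left out to keep it short.

```python
import struct
import zlib

# Hypothetical header: payload length and CRC32, both unsigned 32-bit,
# in network (big-endian) byte order.
HEADER = struct.Struct("!II")


def encode_frame(payload: bytes) -> bytes:
    """Prefix the payload with its length and checksum."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def decode_frame(frame: bytes) -> bytes:
    """Return the payload, or raise ValueError if the frame is damaged."""
    if len(frame) < HEADER.size:
        raise ValueError("frame shorter than header")
    length, checksum = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated frame")  # sender had not finished
    if zlib.crc32(payload) != checksum:
        raise ValueError("checksum mismatch")  # error check failed
    return payload


if __name__ == "__main__":
    frame = encode_frame(b"hello grid")
    print(decode_frame(frame))
    damaged = frame[:-1] + bytes([frame[-1] ^ 0x01])  # flip one bit
    try:
        decode_frame(damaged)
    except ValueError as err:
        print("rejected:", err)
```

The length field is how the sender signals that a message is complete, and the checksum is the agreed error check; both ends must use the same header layout, which is exactly what makes it a protocol.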
Slow to Change
Internet has not changed much since 1983 (when TCP/IP was deployed), which does make it stable, but we still don't really have the envisaged services:
– Multicast (one-to-many communication)
– Network Reservation
– Quality of Service
New protocols: peer-to-peer file sharing and instant messaging
New technology coupled to applications drives change: e-mail, web/file-sharing, video streaming
Past: 1969-1988
ARPANET (1969) 56-kbps lines
– Experiment to investigate resource sharing and remote access
– Added interface message processor (IMP) at each end of network (today's routers), provided flexibility for lower levels and higher level applications
– Success from: freely available documentation and source code; software bundled with new machines; use for teaching; community development vs. proprietary
NSFNET (1985) 45-Mbps lines
– Connect academic HPC centers
ARPANET: 1971
ARPANET: 1980
NSFNET: 1991
Past: 1969-1988
Driving application: e-mail, remote file access, remote job control (drove basic protocols)
Network technology: WAN links were lines leased from telephone companies. Xerox Palo Alto Research Center (PARC) created Ethernet (3 Mbps) (alternatives: token ring (IBM), …). Workstations appear bundled with network protocols. PCs join the network as interface costs dropped and processors became more powerful.
Past: 1969-1988
Protocols and Services
– telnet, file transfer protocol, e-mail
– Underlying transport protocol TCP (stream of bytes which can be opened or closed, data can be sent or received)
– Machine location: Domain Name System (DNS) (replaced a file listing host names)
• Hierarchical, distributed, redundant
Past: 1969-1988
System Integration
– ARPANET: assumed central network operations center
– NSFNET: introduced hierarchical system, top-level backbone network connecting to regional networks connecting to campuses
Packet switching strategy was important (using computing power to optimize communication)
Single communication model was important because it allowed so many people to be connected, driving future development.
Present: 1988-2005
Internet today: complex structure of backbone networks and regional networks
Increased role of private sector (e.g. AT&T, BellSouth), who basically control our network now.
LSU Campus
LANet
Louisiana statewide network: Office of Telecommunications Management, state agencies, higher education: 6 Mbps -> $2450 a month
http://www.state.la.us/otm/lanet/
Qwest
BellSouth
Baton Rouge: 4 DS3 to New Orleans, 1 DS3 to Houston
Abilene (Internet2)
http://abilene.internet2.edu/maps-lists/
Traffic: http://loadrunner.uits.iu.edu/weathermaps/abilene/
National Lambda Rail
http://www.nationallambdarail.org/architecture.html
National Lambda Rail
Global Terabit Research Network
Required Reading
Overview of Recent Supercomputers
– http://www.euroben.nl/reports/overview05a.pdf
Concentrate on pages 1 to 32; you do not need to learn this, just get an appreciation of the concepts.