Introduction Computer Science Henri Bal Vrije Universiteit Amsterdam
Introduction Computer Science
Henri Bal
Vrije Universiteit Amsterdam
Goals of this course
● Understand typical Computer Science topics
● Meet with students and some staff members
● Develop skills:
  ● Reading (English) scientific literature
  ● Critical/analytical thinking about CS topics
  ● Discussing
  ● Presenting
  ● Scientific writing
Structure
● Tuesdays: guest lectures
  ● 2 scientific papers provided as context
  ● Questions made up by lecturers beforehand
● Thursday/Friday/Monday: working groups
  ● 2 students per group present a paper
  ● Each group discusses both papers + questions
Topics (Tuesday lectures)
● Intro & high-performance computing (Henri Bal)
● Finding & reading scientific literature (Michel Klein, with LI & IMM students)
● e-Science infrastructures (Cees de Laat)
● e-Health (Aart van Halteren)
● Astronomy & manycores (Rob van Nieuwpoort)
● Watson (Lora Aroyo, with LI & IMM students)
● Luggage handling at Heathrow Terminal 5 (Huub van der Wouden, with IMM students)
Working Groups
● Supervised by staff members (instructors)
● First meeting:
  ● Instructors will present 1 paper, you do the discussions
● Other meetings:
  ● Students present/discuss papers
● Course material + working group composition will be made available on Blackboard (bb.vu.nl)
Your tasks
● Attend Tuesday lectures
● Send brief answers to questions + pose 2 new questions per paper before workgroup deadline
● Give 1 presentation in a working group
  ● Make slides, talk for 10-15 minutes
● Participate in working group discussions
● Write 2-page paper on 1 topic of your choice
  ● Use (find!) 2 extra publications in the literature
● Grading:
  ● 40% participation, 40% paper, 20% presentation
First presentation
● My personal view on Computer Science
  ● Why is Computer Science so interesting?
● Biased towards my own research area:
  ● High performance distributed computing
Computer Science (CS)
● CS sits between technology and applications, both of which are undergoing turbulent developments
  ● Technology: processors, networks, mobiles, wearables, …
  ● Applications: data explosion in virtually all applications
● CS also studies many fundamental problems of its own
  ● Programming languages, security, AI, theory, …
Outline
● Technology
  ● Computers
    ● Some history
    ● High performance computers
    ● Modern (multicore) PCs
  ● Networks & mobile computing
● Applications
  ● Data explosion
  ● Computation demands
● Fundamental CS questions
Computers
● Mainframe: powerful centralized computer
  ● IBM 704 (1954)
● Minicomputers: <25K$, for small groups
  ● PDP-8, PDP-11, VAX (1960s-1980s)
● Workstations: expensive personal graphical machine
  ● Xerox Alto (1973)
● PCs: inexpensive machine for the masses
  ● IBM PC (1981)
High Performance Computers
● Computer systems with many processors, all computing in parallel
● Paper: “Back to Thin-Core Massively Parallel Processors”
Warning
● Scientific papers may be overwhelming
  ● Have to learn how to read scientific literature, without understanding every word
● “Moreover, smart algorithms that exploit data locality, perform loop unrolling, eliminate iterative loops and recursive algorithms, and use idle-power-friendly programming languages and libraries as well as auto-tuning based on multiversion algorithms can achieve higher-energy-efficiency applications.”
● (You’re not supposed to understand this yet!)
High Performance Computers (1)
● Vector machines
  ● Can do vector operations in parallel
  ● A and B: vectors (1-dimensional arrays) with 100 elements
  ● Computing A+B (= 100 additions) takes as much time as doing 1 addition on a sequential computer
● History
  ● 1970s, 1980s (e.g., Cray)
  ● 2000s (Japanese Earth Simulator)
  ● 2010s (GPUs, Graphics Processing Units)
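The A+B example can be sketched with an idealized step-count model. This is only an illustration: it assumes a hypothetical machine with `lanes` adders working at the same time, and the function names are made up for this sketch. Real vector hardware has start-up costs and memory limits that the model ignores.

```python
import math

def vector_add(a, b):
    """Element-wise sum of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [x + y for x, y in zip(a, b)]

def add_steps(n_elements, lanes):
    """Time steps needed when `lanes` additions can run at once (idealized)."""
    return math.ceil(n_elements / lanes)

a = list(range(100))   # A = 0, 1, ..., 99
b = [1] * 100          # B = 1, 1, ..., 1
c = vector_add(a, b)   # the result is the same on either kind of machine

print(add_steps(100, lanes=1))    # sequential computer: 100 steps
print(add_steps(100, lanes=100))  # idealized vector machine: 1 step
```

The result `c` is identical in both cases; only the number of time steps differs.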
High Performance Computers (2)
● Massively parallel machines
  ● 1000s of special processors connected by a special network, all running in parallel, each doing part of the overall computations
  ● E.g., CM-1, CM-5, Intel Paragon, IBM BlueGene
  ● Connection network uses graph theory (math)
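As one illustration of how graph theory shows up in connection networks, the sketch below models a hypercube, a topology that has been used in some massively parallel machines. The function names and the 4-dimensional size are assumptions chosen for the example; this is not a description of any specific machine's network.

```python
def hypercube_neighbors(node, dimensions):
    """Neighbors of `node` in a hypercube: flip each address bit once."""
    return [node ^ (1 << d) for d in range(dimensions)]

def hops(src, dst):
    """Shortest path length = number of address bits that differ."""
    return bin(src ^ dst).count("1")

dims = 4  # 2**4 = 16 processors (hypothetical size)

print(hypercube_neighbors(0b0000, dims))  # [1, 2, 4, 8]
print(hops(0b0000, 0b1111))               # 4: the longest shortest path (diameter)
```

With N processors, each one needs only log2(N) links and any message arrives in at most log2(N) hops, which is why such graphs are attractive for large machines.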
High Performance Computers (3)
● Cluster computers
  ● Parallel machines built from off-the-shelf (commodity) PCs and networks
  ● Excellent price/performance ratio
  ● Exponential performance growth of processor speeds
  ● See http://www.top500.org for 500 fastest supercomputers
Multicores & Manycores
● All PCs now have >1 compute core
  ● Every PC is a parallel computer!
● Some PCs already have 48 cores
  ● Core count will increase to hundreds
● GPUs (manycores): 1000s of very simple cores
● Intel Xeon Phi (2012): 60 Pentium-1s on 1 chip, with advanced vector support
● Challenge: how to program these things?
Thinking in parallel is hard
● How to split up the work?
● Load balancing
  ● All cores should do the same amount of work
● Communication & synchronization
  ● Cores must exchange data (=overhead)
● Nondeterminism:
  ● A single processor always gives same outcome
  ● With >1 core the outcome may depend on the order (called a “race condition” bug)
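The nondeterminism point can be made concrete with a small simulation. Rather than using real threads, whose timing cannot be controlled, this sketch replays two hypothetical cores that each increment a shared counter in two steps (read, then write) under explicitly chosen orderings. The `run_schedule` helper and both schedules are invented for illustration.

```python
def run_schedule(schedule):
    """Replay read/write steps of two cores incrementing a shared counter.

    `schedule` is a list of (core, step) pairs; step is "read" or "write".
    """
    shared = 0
    local = {}  # each core's private copy of the counter
    for core, step in schedule:
        if step == "read":
            local[core] = shared          # load the shared value
        else:
            shared = local[core] + 1      # store the incremented private copy
    return shared

# Core A finishes its increment before core B starts.
serial = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]
# Both cores read before either writes: one update is lost.
interleaved = [("A", "read"), ("B", "read"), ("A", "write"), ("B", "write")]

print(run_schedule(serial))       # 2
print(run_schedule(interleaved))  # 1
```

Synchronization (for example a lock around the read and write) forces the first ordering, at the price of the overhead mentioned above.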
Current debates
● Should we build chips with:
  ● Very fast/complicated (superscalar) processors?
    ● Hits a “power wall”, hard to increase clock frequency
  ● Many slower/simpler (thin) processors?
    ● Hard to program
● How to deal with energy consumption?
  ● Performance per Watt becomes key factor
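Performance per Watt is simply the computation rate divided by the power draw. The sketch below shows the arithmetic; the two machines and all of their numbers are hypothetical, made up purely to illustrate the calculation, and `flops_per_watt` is a name chosen for this example.

```python
def flops_per_watt(flops, watts):
    """Energy efficiency: floating-point operations per second, per watt."""
    if watts <= 0:
        raise ValueError("power must be positive")
    return flops / watts

# Hypothetical machines (performance in FLOP/s, power in W); not real data.
machines = {
    "machine X": (2.0e15, 4.0e6),   # 2 PFLOP/s at 4 MW
    "machine Y": (1.5e15, 1.5e6),   # 1.5 PFLOP/s at 1.5 MW
}

for name, (flops, watts) in machines.items():
    gflops_per_w = flops_per_watt(flops, watts) / 1e9
    print(f"{name}: {gflops_per_w:.2f} GFLOP/s per watt")
```

In this made-up case machine Y is slower in absolute terms but twice as energy efficient, which is the kind of trade-off the debate is about.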
Networks
● Wide area networks (WANs)
● Local area networks (LANs)
● Mobile networks
● Much more in Computer Networks class
Wide area networks
● ARPANET
  ● First computer network, connecting some US sites (1960s)
  ● Speeds measured in kbit/s
● Internet
  ● Based on standardized (IP) protocol suite
  ● Connect everyone/everything (Internet-of-things)
● Dedicated optical networks (light paths)
  ● 10 Gbit/s, point-to-point
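To get a feeling for these speeds, the sketch below estimates how long a transfer takes at a given link rate. The 1-terabyte data set and the 50 kbit/s figure for an early kbit/s-era link are assumptions picked for the example, and protocol overhead is ignored.

```python
def transfer_seconds(n_bytes, bits_per_second):
    """Idealized transfer time: no protocol overhead, link fully used."""
    return n_bytes * 8 / bits_per_second

ONE_TERABYTE = 1e12   # assumed data set size, in bytes
LIGHT_PATH = 10e9     # 10 Gbit/s dedicated light path
EARLY_LINK = 50e3     # assumed 50 kbit/s link from the kbit/s era

fast = transfer_seconds(ONE_TERABYTE, LIGHT_PATH)
slow = transfer_seconds(ONE_TERABYTE, EARLY_LINK)

print(fast)                        # 800.0 seconds
print(slow / (365 * 24 * 3600))    # roughly 5 years
```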
Local Area Networks
● Ethernet: developed by Xerox PARC (1974)
  ● Speed increased from 10 Mbit/s to 100 Gbit/s
● Cluster computers use Ethernet or faster commodity networks
  ● Myrinet
  ● InfiniBand
An aside
● In Computer Science
  ● k(ilo)=1024
  ● m(ega)=1024^2
  ● g(iga)=1024^3
  ● t(era)=1024^4
  ● p(eta)=1024^5
  ● e(xa)=1024^6
● All has to do with binary numbers
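The link with binary numbers is that 1024 = 2^10, so every prefix in this list is a power of two. A minimal sketch, with `PREFIXES` and `binary_prefix_value` as names invented for the example:

```python
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa"]

def binary_prefix_value(name):
    """Value of a prefix under the 1024-based convention listed above."""
    exponent = PREFIXES.index(name) + 1
    return 1024 ** exponent

for position, name in enumerate(PREFIXES, start=1):
    value = binary_prefix_value(name)
    # A power of two has exactly one 1-bit, so bit_length() - 1 is its exponent.
    print(f"{name}: 1024^{position} = 2^{value.bit_length() - 1} = {value}")
```

Link speeds such as Gbit/s are conventionally quoted with powers of 1000 instead, so which convention applies depends on the context.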
DAS-4
● Dual quad-core Xeon E5620
● 24-48 GB memory
● 1-10 TB disk
● InfiniBand + 1 Gb/s Ethernet
● Various accelerators (GPUs, multicores, …)
● Scientific Linux
● Built by ClusterVision
● Sites: VU (74), TU Delft (32), Leiden (16), UvA/MultimediaN (16/36), ASTRON (23)
● Connected by SURFnet6 10 Gb/s light paths
Mobile computing
● Laptops, sensors, smartphones, tablets
● Many forms of mobile networks
  ● Wi-Fi (local range)
  ● 3G, 4G (lower bandwidth, high coverage)
  ● Bluetooth (for pairing devices)
● Ultimately: ubiquitous computing?
  ● Vision by Mark Weiser (1988)
  ● “machines that fit the human environment instead of forcing humans to enter theirs”
Outline
● Technology
  ● Computers
    ● Some history
    ● High performance computers
    ● Modern (multicore) PCs
  ● Networks & mobile computing
● Applications
  ● Data explosion
  ● Computation demands
● Fundamental CS questions
Application developments
● There is a “data explosion” in many application areas
  ● Huge amounts of data (up to Petabytes/year)
  ● Very complicated/heterogeneous data
● Demand for computing
  ● Model (simulate) designs on a computer
Data explosion
● Society:
  ● Web, social networks
● Industry, economy:
  ● Banks, stock markets
● Science:
  ● LHC (“Higgs particle”)
    ● Data stored on world-wide “grid”
  ● Bioinformatics (next generation sequencing)
  ● Astronomy: software telescopes (LOFAR, SKA)
Computing demands
● Computational science:
  ● Modeling ozone layer, climate, ocean, human brain
  ● Simulating galaxies
● Engineering:
  ● Aircraft modeling, designing F1 cars (Virgin VR-01)
  ● TVs (mostly software), embedded systems
● Games and multimedia:
  ● Computer chess (Deep Blue)
  ● Watson (Jeopardy)
  ● Analyzing multimedia content
  ● Generating movies
Pixar’s “Up” (2009)
Whole movie (96 minutes) would take 94 years on 1 PC
(4 frames per day; 1 second takes 6 days; 1 minute per year)
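The numbers on this slide can be checked with a few lines of arithmetic. The sketch assumes a frame rate of 24 frames per second (which is what "4 frames per day; 1 second takes 6 days" implies) and 365-day years; `pcs_needed` is a hypothetical helper that additionally assumes frames can be rendered independently with no overhead.

```python
import math

FPS = 24              # assumed film frame rate
FRAMES_PER_DAY = 4    # rendering speed of 1 PC, from the slide
MOVIE_MINUTES = 96    # length of the movie, from the slide

total_frames = MOVIE_MINUTES * 60 * FPS
days_on_one_pc = total_frames / FRAMES_PER_DAY
years_on_one_pc = days_on_one_pc / 365

def pcs_needed(target_days):
    """PCs required to finish in `target_days`, assuming perfect parallelism."""
    return math.ceil(days_on_one_pc / target_days)

print(total_frames)                # 138240 frames
print(round(years_on_one_pc, 1))   # 94.7 years on 1 PC
print(pcs_needed(365))             # 95 PCs to finish within one year
```

Because each frame can be rendered separately, this kind of workload spreads naturally over a cluster.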
Some fundamental Computer Science topics (1)
● Operating systems:
  ● Windows, Linux, Minix (Andy Tanenbaum)
● Programming languages and systems
  ● Fortran, Cobol, C, Java, Python … (thousands)
What happens if you ask a computer scientist to solve a problem?
He/she will come back 3 months later, with …
a new programming language ideally suited for solving your problem
Some fundamental Computer Science topics (2)
● Security
  ● Preventing/detecting attacks, privacy, etc.
● (Semantic) web technology
  ● Finding and reasoning about content on the web
● Cloud computing
  ● Store data and programs remotely, in the Cloud
Some fundamental Computer Science topics (3)
● Artificial intelligence
  ● E.g., machine learning
● Databases
  ● Storing and searching huge amounts of data
● Logic, modelling, graph theory, complexity
  ● Essential for many applications
Conclusion
● Modern Computer Science deals with hectic developments in technology and applications
● Both provide us with many research problems
  ● Application-driven vs technology-driven research
● There are also many fundamental CS problems
Literature (Context)
● Ami Marowka: Back to Thin-Core Massively Parallel Processors, IEEE Computer, December 2011, pp. 49-54
QUESTIONS
● Explain what “thin cores” are
● What are the arguments for and against using “thin cores”?
● What role does energy consumption play in this discussion?
● Compute the energy efficiency of the current 10 largest supercomputers on www.top500.org
● Which type of machine currently is most energy efficient?
● Compare the maximum performance of the current #1 against the performance of the #1 of 10 years ago. What is the difference?