Jawwad A Shamsi Nouman Durrani Nadeem Kafi Systems Research Laboratories, FAST National University...

Transcript of Jawwad A Shamsi Nouman Durrani Nadeem Kafi Systems Research Laboratories, FAST National University...

  • Slide 1
  • Novelties in Teaching High Performance Computing. Jawwad A Shamsi, Nouman Durrani, Nadeem Kafi. Systems Research Laboratories, FAST National University of Computer and Emerging Sciences, Karachi
  • Slide 2
  • High Performance Computing: CPU Intensive, Data Intensive
  • Slide 3
  • HPC Curriculum at BS level. Single Course: CPU intensive, Data intensive. Multiple Courses: CPU intensive, Data intensive. Can only cover either CPU-intensive or data-intensive computing
  • Slide 4
  • HPC Curriculum at BS level: Shared Memory, Distributed Memory. It is important to teach both aspects of HPC
  • Slide 5
  • HPC Curriculum at BS level: Shared Memory, Distributed Memory. It is pertinent to incorporate breadth of knowledge: GPGPU, OpenMP, MPI, Hadoop
  • Slide 6
  • Our Thesis for HPC Curriculum
      • A single course
      • Breadth of knowledge
      • CPU intensive and data intensive
      • Shared memory and distributed memory systems
  • Slide 7
  • Novel Contributions of This Paper
      • Introduced a specific course
      • Introduced students to multiple HPC platforms
      • Imparted practical knowledge
      • CPU and data intensive systems
      • Shared memory and distributed memory systems
  • Slide 8
  • Pedagogical Goals (Goal: Description)
      • G1: Understand basic concepts of high performance computing
      • G2: Provide practical experience on multiple HPC platforms
      • G3: Impart knowledge of parallel programming algorithms
      • G4: Motivate students for advanced topics and learning
  • Slide 9
  • Assignments (Assignment: Description, then Goals)
      • A1. Description: MPI program to scatter a computational task to worker nodes in the cluster and gather the results back at the root node. Goals: Learn techniques for process creation and task distribution. Understand methods for point-to-point and collective communication between processes using MPI.
      • A2. Description: For a large computational problem, identify opportunities for parallelism. Use OpenMP to solve the computational problem. Goals: Stimulate students' thinking about applying parallelism. Learn parallel computing in a shared memory architecture using OpenMP.
      • A3. Description: Solve the movie-ratings problem from Netflix using Hadoop and MapReduce. Goals: Learn data-intensive computing. Understand programming in Hadoop using MapReduce.
      • A4. Description: Multiplication of large matrices using GPUs. Goals: Comprehend knowledge about GPUs. Learn GPGPU programming for CPU-intensive tasks.
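The map, shuffle, and reduce phases that assignment A3 exercises can be sketched in plain Python. This is only an illustration of the programming model, not Hadoop code and not the course's actual solution: the slides do not say which statistic the movie-ratings problem asks for, so computing the average rating per movie is an assumption, and the `movie_id,user_id,rating` record layout and all function names are hypothetical.

```python
from collections import defaultdict


def map_rating(line):
    """Map phase: parse one 'movie_id,user_id,rating' record (hypothetical
    layout) and emit a (movie_id, rating) key-value pair."""
    movie_id, _user_id, rating = line.strip().split(",")
    return movie_id, float(rating)


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key, as the framework
    would do between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_ratings(movie_id, ratings):
    """Reduce phase: collapse all ratings of one movie into their average
    (the choice of statistic is an assumption)."""
    return movie_id, sum(ratings) / len(ratings)


def average_ratings(lines):
    """Run the three phases sequentially on an in-memory list of records."""
    pairs = [map_rating(line) for line in lines]
    groups = shuffle(pairs)
    return dict(reduce_ratings(key, values) for key, values in groups.items())


# Tiny made-up sample, for illustration only.
sample = ["m1,u1,4", "m1,u2,2", "m2,u1,5"]
print(average_ratings(sample))
```

On a real cluster the map and reduce functions would run on many nodes in parallel and the framework would perform the shuffle; the sequential driver here only shows how data flows between the phases.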
  • Slide 10
  • Week: Topic (Alignment with Goals)
      • Week 1: Introduction to important concepts of HPC (G1)
      • Week 2: Task Division, Interaction, and Solving Techniques. HPC Clusters. Introduction to MPI (G1, G2, and G4)
      • Week 3: MPI Communication, Message Passing. (A1) (G2 and G4)
      • Week 4: Dynamic Process Creation in MPI. File Access in MPI. Quiz 1 (G2 and G4)
      • Week 5: Shared Memory Clusters. OpenMP Programming. Assignment on OpenMP (A2) (G2 and G4)
      • Week 6: First Hourly. Hybrid Clusters using MPI (G2 and G4)
      • Weeks 7, 8: Data Intensive Computing. Introduction to MapReduce (G3 and G4)
      • Weeks 9, 10: Hadoop Open Source Platform for MapReduce (G3 and G4)
      • Weeks 11, 12: Architecture of Hadoop. Assignment on Hadoop (A3) (G3 and G4)
      • Week 13: Hadoop and MapReduce: Applications (G3 and G4)
      • Week 14: GPU/GPGPU Programming. Architecture Concepts in GPU (G2 and G4)
      • Week 15: Grid, Thread, and Block Concepts. Device to Host Communication. Programming Assignment on GPU (A4) (G2 and G4)
      • Week 16: Project Presentations (G2 and G4)
      • Week 17: Final Examination (G1, G2, G3, and G4)
  • Slide 11
  • Topics Covered (S. No, Topic)
      • Systems: 1 Configuration of Clusters; 2 Use of Cloud; 3 Network and Distributed File System Architecture; 4 Parallel Computing Architecture; 5 GPU Architecture
      • Algorithms and Applications: 6 CPU Intensive Computing; 7 Data Intensive Computing; 8 Parallel Algorithms
      • Programming on Distributed Memory: 9 MPI; 10 Hadoop
      • Programming on Shared Memory: 11 OpenMP; 12 GPU
  • Slide 12
  • PDC Topics Covered (Area: Topics with Bloom Level)
      • Architecture: Taxonomy (C), Multi-core (C), SMP (A), NUMA (C), ILP (C)
      • Algorithms: Divide and Conquer (A), Reduction (A), Recursion (A), Scan (C), Speedup (A), Task graph (A), Scatter (A), Gather (A), Multicast (A)
      • Programming: Gustafson's law (C), Amdahl's law (C), Shared Memory (A), Static and Dynamic mapping (A), Load balancing (C), Synchronization (A), Critical regions (A), Compiler directives (A), Producer Consumer (A), Task/thread spawning (A), SPMD/SIMD (A), Hybrid (A), Distributed Memory (A), Client/Server (A), Data Parallelism (A), Data Locality (A), Work Stealing (K)
      • Advanced Topics: Cluster Computing/Grid computing (A), Cloud Computing (A), Web Search (C/A), Social Networking (C), Distributed File System (A), GPU Architecture (C/A)
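The two speedup laws named in the Programming row can be compared with a short Python sketch. The formulas are the standard textbook forms; the function names and the example values (90% parallel work, 10 processors) are chosen here for illustration and do not come from the slides.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Amdahl's law: speedup for a fixed problem size, where
    parallel_fraction is the share of the serial run time that
    can be parallelized."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


def gustafson_speedup(parallel_fraction, processors):
    """Gustafson's law: scaled speedup when the problem grows with the
    machine, where parallel_fraction is the share of the parallel run
    time spent in parallel work."""
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction + parallel_fraction * processors


# Illustrative values only: 90% parallel work on 10 processors.
print(round(amdahl_speedup(0.9, 10), 2))
print(round(gustafson_speedup(0.9, 10), 2))
```

With a fixed problem size the serial part caps the speedup at 1 / (1 - parallel_fraction) however many processors are added, while the scaled-workload view grows linearly with the processor count.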
  • Slide 13
  • Student Evaluations
  • Slide 14
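  • Student Marks in Assignments
      • A1 (MPI): average 89.69, median 100, std. dev. and min 14.160 (boundary between the two values unclear), max 100
      • A2 (OpenMP): average 95.6, median 100, std. dev. 8.96, min 70, max 100
      • A3 (Hadoop): average 87.95, median 90, std. dev. 11.90, min 60, max 100
      • A4 (GPU): average 85.30, median 90, std. dev. 16.90, min 50, max 100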
  • Slide 15
  • Overall Assessments of Students
  • Slide 16
  • Students' Feedback about Multiple Platforms (Question: Results)
      • Multiple HPC platforms enhanced learning: Strongly Agree (63.63%), Agree (31.81%), Neutral (4.54%), Disagree (0%), Strongly Disagree (0%)
      • Programming helped in learning: Strongly Agree (77.27%), Agree (13.63%), Neutral (9.09%), Disagree (0%), Strongly Disagree (0%)
      • Group discussion and interactive style helped in learning: Strongly Agree (63.63%), Agree (18.18%), Neutral (13.63%), Disagree (4.54%), Strongly Disagree (0%)
  • Slide 17
  • Overall Marks given by Students: Average 9, Min 7, Median 9, Max 10
  • Slide 18
  • Conclusion
      • Programming provides an effective method for learning.
      • Using multiple HPC platforms provided an effective way of learning.
      • Both data-intensive and CPU-intensive computing need to be covered in a parallel computing course.
      • Cutting-edge topics such as GPU (CUDA), Hadoop, and Cloud computing are very popular among students.
      • Interactive learning, peer discussion, and group discussions are effective in teaching.
      • Student feedback should be incorporated for elective courses.