Presentation 2 Spring 2016 FINAL fat cut (1)

Design, Implementation, and Characterization of a Raspberry Pi Cluster for High-Performance Computing
Michael Vistine, Katy Rodriguez, Ralph Walker II
For our senior design project, we are testing high-performance computing on the Raspberry Pi 2. The Raspberry Pi 2 has a 900 MHz quad-core ARM CPU, which we push to its limit with tests that compare wired and wireless networking, number of cores against execution time, and temperature against clock speed. In the wired design, one master Pi communicates with three slave nodes through a router that we use as a switch. The master Pi launches the test program and connects over SSH to the slave Pis, which supply most of the computing power while the program runs through Open MPI.

TEAM 5 MEMBERS
Michael Vistine: Software Engineer
Katy Rodriguez: Integration Engineer
Ralph Walker: Hardware Engineer

OVERVIEW
• Motivation
• Hardware/Design Description
• Software
• Data
• Timeline/Current Status
• Conclusion & Questions

MOTIVATIONS
Design
• Cluster Computing
• Compact
• Active Cooling
Raspberry Pi
• Low-cost multicore processor
• Open Source Code
Characterization of the Design
• Nodes vs. Performance
• Wireless vs. Wired Performance
• Passive vs. Active Cooling
Photo courtesy of azchipka.thechipkahouse.com

HARDWARE COMPARISON
              Pi 1B+          Pi 2 B          BeagleBone                 Pi 3
Processor     700 MHz         900-1000 MHz    1 GHz                      1.2 GHz
Cores         1               4               1                          4
RAM           512 MB          1 GB            512 MB                     1 GB
Peripherals   4 USB ports     4 USB ports     2 USB ports                4 USB ports
Power Draw    0.31 A          0.42 A          0.46 A                     0.58 A
Storage       Micro SD slot   Micro SD slot   2 GB on board & Micro SD   Micro SD slot
Price         ~$30            ~$35            ~$55                       ~$35
Photos courtesy of pcworld.com, ti.com, adafruit.com, and hifiberry.com

HARDWARE
Anker 60W 6-Port USB Charger PowerPort
• 2.4 amps per port
• Multi-device charging
• Surge protection
Wireless Router TP-Link TL-WR841N
• 300 Mbps wireless connection
• Adjustable DHCP settings
• Wireless On/Off switch
• 4 LAN ports
Photos courtesy of Amazon

DESIGN DESCRIPTION
[Block diagram. Labels: Power, Router, RPI0 (Master Node), RPI1, RPI2, RPI3, Open MPI, Test.c]

DESIGN DESCRIPTION
Final Design
• Custom-made 3D-printed enclosure using PTC Creo Elements
• Laser-cut plexiglass
• Wired/wireless router
• Heat sinks and PC fan
• Power hub
Photo courtesy of Katy Rodriguez

SOFTWARE
OPERATING SYSTEM: RASPBIAN JESSIE
◦ Based on Debian Linux
◦ Lightweight OS
◦ Open source
◦ Bash terminal interface
◦ Kernel version 4.1
◦ Pre-installed with educational programming languages
Photo courtesy of raspberrypi.org

SOFTWARE
• Bash terminal, used to:
◦ Edit and create configuration files
• Style of syntax used to operate in the terminal:
◦ $ sudo apt-get install <package-name> (used to install packages)
• Open MPI:
◦ An implementation of the Message Passing Interface (MPI), used for parallel computing
◦ The data is broken into smaller chunks and distributed to the nodes, which run simultaneously

SETTING UP THE MASTER
• First, all packages were updated
• The configuration was finalized using sudo raspi-config
• Settings for the master were the same as for the slave nodes:
◦ Set the hostname to rpi0
◦ Enable SSH
◦ Set the memory split to 16 (MB of RAM reserved for the GPU)

SETTING UP SECOND PI
• Install all the same packages as on the master node
• Run sudo raspi-config to set the same system preferences as on the master node
Photo courtesy of www.raspberrypi.org

CALL_PROCS (TEST PROGRAM 1)
#include <stdio.h>  // Standard input/output library
#include <mpi.h>    // MPI library

int main(int argc, char** argv)
{
    // MPI variables
    int num_processes;
    int curr_rank;
    char proc_name[MPI_MAX_PROCESSOR_NAME];
    int proc_name_len;

    // Initialize MPI
    MPI_Init(&argc, &argv);

    // Get the number of processes
    MPI_Comm_size(MPI_COMM_WORLD, &num_processes);

    // Get the rank of the current process
    MPI_Comm_rank(MPI_COMM_WORLD, &curr_rank);

    // Get the processor name for the current process
    MPI_Get_processor_name(proc_name, &proc_name_len);

    // Check that we're running this process
    printf("Calling process %d out of %d on %s\r\n", curr_rank, num_processes, proc_name);

    // Shut down MPI
    MPI_Finalize();

    return 0;
}
• Creates a user-specified number of dummy processes of equal size
• Allocates the processes dynamically to each node
• Displays the process number upon completion

CALC_PI (MAIN PROGRAM)
#include <stdio.h>
#include <math.h>
#include <mpi.h>
#define TOTAL_ITERATIONS 10000

int main(int argc, char *argv[])
{
    // MPI variables
    …
    sum = 0.0;
    // Determine the step size
    h = 1.0 / (double) total_iter;
    // The current process performs operations on its rank,
    // advanced by multiples of the total number of processes
    // rank = 3,
    for (step_iter = curr_rank + 1; step_iter <= total_iter; step_iter += num_processes)
    …
    // Resolve the sum into the calculated value of pi
    curr_pi = h * sum;
    // Reduce all processes' pi values to one value
    MPI_Reduce(&curr_pi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    }
    // Print out the final value and error
    printf("calculated Pi = %.16f\r\n", pi);
    printf("Relative Error = %.16f\r\n", fabs(pi - M_PI));
    // Wrap up MPI
    MPI_Finalize();
This program approximates the value of pi using 10,000 iterations, split across the processes.

KEY GENERATION
• SSH keys were generated; a passphrase is recommended
◦ A randomart image (a block of random-looking characters) was then displayed for the key
• Next, the key was copied to the slave nodes
Photo courtesy of visualgdb.com

SETTING UP THE NETWORK
• Set all node IP addresses as static
◦ sudo nano /etc/network/interfaces (edit on all nodes)
• Map all hostnames to the new static IPs
◦ sudo nano /etc/hosts (edit on all nodes)
• We could set up static IPs for only one interface at a time, either wired or wireless, to prevent conflicts with the mounts

SETTING UP WIRELESS CONNECTION
• Setting up the wireless connection was essentially the same as setting up the wired connection
• /etc/hosts was edited to add the new IP addresses and hostnames
/ETC/HOSTS
Photo courtesy of Mike Vistine

SETTING UP THE WIRELESS CONNECTION
This figure shows the wireless setup of /etc/network/interfaces
Photo courtesy of Mike Vistine

COMMON USER AND NFS
• Next, a common user was created on all nodes so that the nodes could communicate without repeated password entry
• Next, the nodes were mounted onto the master node over NFS

COMMON USER AND NFS
• sudo nano /etc/exports
◦ Lines added at the bottom of the file:
◦ /mirror 192.168.0.0/24(rw,sync) [for wired]
◦ /mirror 192.168.1.0/24(rw,sync) [for wireless]
• These steps were repeated for all slave nodes

AUTOMOUNT SCRIPT
• For each node, /etc/rc.local was edited
• A few lines were added at the end of the file to print “mounting network drives”
• This script was supposed to automatically mount the drives on boot
• The automount function was incredibly slow
Photo courtesy of Mike Vistine

RUNNING OPENMPI
• Log in as mpiu on the master node using su - mpiu
• Switch to the /mirror/code/ directory
• mpicc calc_pi.c -o calc_pi
• time mpiexec -n 4 -H RRPI0-3 calc_pi

OPENMPI
• The screenshot shows the .c files and the executables in the directory
• It also shows the execution of the program call_procs with mpiexec
Photo courtesy of Mike Vistine

CALC_PI TEST
• Here you can see an example of the format while running the calc_pi test
• Each core and the number of threads are designated in the MPI command
Photo courtesy of Mike Vistine

WIRELESS CALC_PI TESTS
• For wireless MPI to work, the mounts had to be set manually
• The NFS kernel server had to be restarted each time the Pis were powered off or rebooted

DESIGN CHARACTERISTICS
• Wired vs. wireless performance
◦ Test the processing performance of the cluster when:
▪ Hard-wired to the router
▪ Using dongles for each node to communicate wirelessly
• Computational benchmark tests
◦ Using benchmark software to observe total processing power across all Pis
◦ Using a complicated program as test material to solve with the cluster
• Graphical performance info
• Implementation of practical applications
• Active cooling of the Pis
◦ Fans implemented in final case design

Wired vs Wireless performance
• Wired performance did prove to be more efficient
• The wireless values were inconsistent
• Each recorded value per core was an average of three runs

Temperature vs Clock Speed
• With passive cooling, temperatures proved to be higher both before and after running the wireless data test.
• Active cooling significantly improved temperature regulation of each Pi.

Active cooling vs passive cooling
• Passive cooling results were very erratic.
• Active cooling results were consistent and had better test times.

TIMELINE
Aug. 28 – Sept. 27
Sept. 23 – Dec. 10
Oct. 11 – March 23
Jan. 4 – April 5
Feb. 9 – April 15

BUDGET
[Charts: Budget from Scratch; Total Project Budget]

CURRENT STATUS
• All project tests are complete
• Data has been collected for analysis
• Case is 98% complete

IMMEDIATE TASKS
• Add finishing details to documentation and case design
• Make senior design day poster
• Prepare for senior design day

CONCLUSIONS
• Experiment completed
• Wired proved to be faster and more reliable than wireless
• Active cooling made a significant difference in performance and temperature regulation

QUESTIONS?