Lab 10 (5 points)
CS550, Operating Systems
File Systems and Scheduling
Name:
_____________________________________________
To submit this assignment, copy and paste your answers into a text
editor such as nano, vi, Notepad, MS Word, OpenOffice Writer, etc.
Zip your code along with scripts showing the output of your
solutions, and submit the zip file to the dropbox for Lab 10. Be
sure to include a text document containing any written/typed/graphed
results.
The following lab is based in part upon labs provided at the TACC
Xeon Phi Tutorial from the XSEDE 2013 conference. In this lab, you
will work with the Xeon Phi accelerator and learn more about the
MPI_Status structure, the MPI_Waitsome function, dining philosophers
in MPI, and the lfs command for viewing and changing properties of
the Lustre file system. You will also learn some scripting basics.
Upload the code at this link to Stampede. Next, compile
messages3.c on Stampede as follows:
mpicc messages3.c -mmic -O3 -o micMsg.exe
The command above compiles the file for the Intel Xeon Phi
Accelerator.
mpicc messages3.c -O3 -o hostMsg.exe
The command above compiles the file for the main CPU (Sandy Bridge)
on a node of Stampede.
Now run the following command:
idev -A TG-SEE120004
1. What happened?
Hint: this command gives you an interactive session on a Stampede
compute node. You will have access to a node of your own for 30
minutes after running idev.
Every standard node on Stampede includes a Xeon Phi
accelerator. Currently, code intended for these accelerators must
be compiled with the Intel compiler and must use the Intel MPI
library.
On high-performance computing systems, a module toolkit lets users
choose among multiple variants of libraries that do the same
job. For example, OpenMPI, MPICH2, MVAPICH2, and IMPI are all
implementations of the Message Passing Interface. They do almost
the same thing, but they may offer slightly different
features. Stampede loads the MVAPICH2 library by default (an MPI
variant created at Ohio State University); you must swap it for
Intel MPI. To do this, type the following:
module swap mvapich2 impi
To check that this change happened type:
module list
2. What happened?
This shows the modules (pieces of software) that have been
loaded. You should see Intel MPI listed in your loaded
modules.
Before working with the Xeon Phi, you will need to set up several
environment variables. Run the following commands:
source ./setup_mic.sh
export MIC_PPN=60
These commands set several environment variables. Look at the
setup_mic.sh file and then run the command:
env
3. What happened? Were the environment variables set
properly?
Run the code three times using ibrun.symm from the compute node
under idev, as follows:
ibrun.symm -c hostMsg.exe
ibrun.symm -m micMsg.exe
ibrun.symm -c hostMsg.exe -m micMsg.exe
It is also possible to use a variant of the mpiexec command. Recall
that mpiexec or mpirun can be used on LittleFe, too.
mpiexec.hydra -env LD_LIBRARY_PATH $MIC_LD_LIBRARY_PATH -env I_MPI_PIN_MODE mpd -env KMP_AFFINITY balanced -n 60 -host mic0 ./micMsg.exe
mpiexec.hydra -n 16 -host localhost ./hostMsg.exe
mpiexec.hydra -n 16 -host localhost ./hostMsg.exe : -env LD_LIBRARY_PATH $MIC_LD_LIBRARY_PATH -env I_MPI_PIN_MODE mpd -env KMP_AFFINITY balanced -n 60 -host mic0 ./micMsg.exe
Now exit idev by typing exit and pressing enter.
4. What happened after running each of the six commands above?
Included at the following link
is a zip file containing a variant of the dining philosophers code.
5. Explain what this code does.
a) What does the philosopher function do?
b) How is deadlock prevented?
c) What does the server function do?
d) What are the parts of the MPI_Status struct?
e) What is the purpose of each of these
parts?
f) Why are these parts needed?
Explain with respect to the MPI_Waitsome function.
Notice that you must provide two command-line arguments to run the
philosopher code: the number of philosophers (which should be one
less than the number of processes) and the number of meals each
philosopher eats.
Run the dining philosophers code using mpiexec.hydra under idev in
the same three ways shown above. Don't forget to include the
command-line arguments. First, run the host version on the Sandy
Bridge cores of the node; second, run on the Xeon Phi card (called
a MIC and pronounced "mike"); and third, run on both the MIC and
the Sandy Bridge cores at the same time.
Note that running on the main CPU sockets and the MIC at the same
time is called symmetric computing.
6. Include your results in your submission. Which runs take
the longest?
7. Review the code for the dining philosophers. Why do you
think the program that took the longest required so much time?
Consider the following in your answer:
The Sandy Bridge Processors on each Stampede node have the following
specifications:
Memory: 32GB
Clock Speed: 2.7GHz
Memory Bandwidth: 51.2 GB/s * 2
Vector length: 4 Double Precision words
Core count: 8 per socket * 2 sockets = 16
The Xeon Phi cards have the following specifications:
Memory: 8GB
Core count: 61
Hardware threads: 244
Clock speed: 1.1GHz
Memory bandwidth: 352 GB/s (on card)
Vector length: 8 DP words
Read about Lustre at http://en.wikipedia.org/wiki/Lustre_(file_system)
8. What is a stripe? An OST? An OSS?
Finally, you will make use of the lfs command on Stampede.
9. First, run man lfs. What happens? What
is the purpose of lfs?
10. Next, run lfs check mds. What happens?
11. Then, run lfs getstripe ~. What
happens? What is the stripe size of your files? Do any
of your files use more than one stripe?