How to run UPC code at MTU

This is a FAQ for UPC programmers at MTU about how to compile and run UPC code on the various CS&E machines in the CEC.

General information

Q1: What UPC compilers do we have? On what platforms do they run?

gilbert.cse.mtu.edu
  Architecture: Linux x86 cluster; 26 2-way 1.8GHz Dual-Core Xeon nodes; Infiniband interconnect (only nodes 1-24); Gigabit Ethernet interconnect
  UPC compilers: MuPC; Berkeley's UPC
  Documentation:
    http://www.upc.mtu.edu/MuPCdistribution/
    http://upc.nersc.gov/docs/

lionel.cse.mtu.edu
  Architecture: Linux x86 cluster; 20 2-way 2.0GHz Pentium nodes; Myrinet interconnect (only nodes 1-16); Gigabit Ethernet interconnect (only nodes 1-16)
  UPC compilers: MuPC; Berkeley's UPC
  Documentation:
    http://www.upc.mtu.edu/MuPCdistribution/
    http://upc.nersc.gov/docs/

flyer.cse.mtu.edu
  Architecture: AlphaServer cluster; 8 4-way ES40 nodes with 833MHz Alpha EV68 21264s; Quadrics interconnect
  UPC compilers: MuPC; Berkeley's UPC; HP's UPC
  Documentation:
    http://www.upc.mtu.edu/MuPCdistribution/
    http://upc.nersc.gov/docs/
    http://www.hp.com/go/upc

flash.cse.mtu.edu
  Architecture: Cray T3E; 48 Alpha 21064s; Cray interconnect
  UPC compilers: GCC-UPC; Bill's UPC
  Documentation:
    http://www.intrepid.com/upc/gcc-upc-manpage.html
    (No longer used.)

Q2: How do I run everything on everything?

Start by logging in, for example, with the command ssh lionel.cse. In the commands below, num is the number of UPC threads and myprog.c is your source file.

1. MuPC
Compile:
On lionel and gilbert: /usr/local/MuPC/bin/mupcc -f num myprog.c
On flyer: /usr/local/MuPC/bin/mupcc -f num myprog.c
Execute:
A MuPC executable is really an MPI program, so run it just as you would run any MPI program.
On lionel and gilbert: mpirun -np num ./a.out
On flyer: prun -I -n num ./a.out

2. Berkeley's UPC
Compile:
On lionel and gilbert: /usr/local/berkeley_upc/bin/upcc -T num myprog.c
On lionel (old): /home/bonachea/.upc-dist/inst/bin/upcc -T num myprog.c
On flyer: /usr/local/berkeley/upc/stable/runtime/inst/bin/upcc -T num myprog.c
Execute:
On lionel and gilbert: /usr/local/berkeley_upc/bin/upcrun -n num ./a.out
On lionel (old): /home/bonachea/.upc-dist/inst/bin/upcrun -n num ./a.out
On flyer: /usr/local/berkeley/upc/stable/runtime/inst/bin/upcrun -n num ./a.out

Berkeley's UPC needs a configuration file in order to work smoothly. On lionel, you can create the configuration file by copying /home/zhazhang/.upccrc to your home directory. On flyer, please copy /usr/users/zhazhang/.upccrc to your home directory. On gilbert, the default configuration should be sufficient.

3. HP's UPC
Compile:
/usr/bin/upc -fthreads num myprog.c
Execute:
prun -I -n num ./a.out

4. GCC-UPC
Compile:
You need to load the gcc-upc module every time you log on.
module load gcc-upc
upc -fupc-threads-num -x upc myprog.c
Execute:
./a.out
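
As an illustration of the MuPC case (item 1 above), the following sketch prints the compile and run commands for a given machine and thread count. The paths and switches are the ones listed above; mupc_commands is a hypothetical helper name and is not part of MuPC.

```shell
# Print the MuPC compile and run commands for one machine.
# Usage: mupc_commands MACHINE NUM [SOURCE]
# mupc_commands is a hypothetical helper, not part of MuPC.
mupc_commands() {
    host=$1
    num=$2
    src=${3:-myprog.c}
    case "$host" in
        lionel|gilbert) run="mpirun -np $num ./a.out" ;;
        flyer)          run="prun -I -n $num ./a.out" ;;
        *)              echo "unknown machine: $host" >&2; return 1 ;;
    esac
    echo "/usr/local/MuPC/bin/mupcc -f $num $src"
    echo "$run"
}

mupc_commands lionel 4
mupc_commands flyer 8
```

The function only prints the commands, so you can check them before running them yourself.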

Q3: Where can I get more information about how to use MuPC?
See the MuPC web site listed under Q1.

Q4: Where can I get more information about how to use HP's UPC?
(1) Read the man page. (2) See the HP's UPC web site listed under Q1.

Q5: Where can I get more information about how to use Berkeley's UPC?
(1) Try "upcc -help", which lists all possible command line options. (2) See the Berkeley's UPC web site listed under Q1.

Q6: Where can I get more information about how to use GCC-UPC?
See the GCC-UPC web site listed under Q1.

Q7: How many UPC threads can I launch for my program?
It is generally a good idea to have one UPC thread per node. Having multiple threads per node is possible but usually leads to very poor performance. The only exception is on flyer, where we can have 2 UPC threads per node and still get good performance. Therefore, the suggested maximum values for THREADS on the three platforms are listed below:

lionel: 16 nodes, 16 UPC threads (32 threads possible, but for testing only)
flyer: 8 nodes, 16 UPC threads with 2 threads per node
flash: 48 nodes, 48 UPC threads

For details, see Q1, Q4 and Q5 in the "Run time system configuration" section.

Run time system configuration

Q1: When running MuPC, how do I select which nodes to use on lionel?
Let lionel decide for you. Lionel will choose the least busy nodes and execute your program on them.
- OR -
Use a machine file. A machine file is a plain text file containing 16 lines, with each line specifying one node name. Node names are n1, n2, ..., n16. The nodes may appear in any order; the run time system always picks the first node listed to run Thread 0, the second to run Thread 1, and so on. The machine file name must be supplied on the command line using the -machinefile switch, as in the following:
mpirun -np 16 -machinefile mf ./a.out
Please note that only nodes n1 to n16 are eligible nodes to be in the machine file. Read more in Q2 and Q3.
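
As a minimal sketch of the machine file described above, the following shell commands write a file named mf (the name used in the mpirun example) containing nodes n1 to n16, one per line. The loop is only one way to produce the file; typing it by hand works just as well.

```shell
# Write a machine file listing compute nodes n1..n16, one per line.
mf=mf
i=1
: > "$mf"                  # create or truncate the file
while [ "$i" -le 16 ]; do
    echo "n$i" >> "$mf"
    i=$((i + 1))
done
cat "$mf"
# On lionel you would then run: mpirun -np 16 -machinefile mf ./a.out
```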

Q2: What happens if some nodes in my machine file are down?
You can still run your programs as long as there are enough running nodes for your purposes. But you have to take the faulty ones out of your machine file by deleting them from the machine file or prepending each of them with a # sign.
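
The following sketch shows one way to take faulty nodes out of a machine file by prepending a # sign. The file name mf and the list of down nodes (n7 and n12) are hypothetical; substitute your own.

```shell
# Start from a full machine file (n1..n16).
mf=mf
for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16; do
    echo "n$i"
done > "$mf"

# Hypothetical list of nodes that are currently down.
down="n7 n12"

# Prepend a # sign to each faulty node so it is skipped.
for node in $down; do
    sed "s/^${node}\$/#${node}/" "$mf" > "$mf.tmp" && mv "$mf.tmp" "$mf"
done

# Count the nodes still usable. With one UPC thread per node,
# this is the largest value to pass to -np.
active=$(grep -c -v '^#' "$mf")
echo "$active usable nodes"
```

The sed pattern is anchored at both ends so that, for example, taking out n1 would not also touch n10 to n16.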

Q3: When should I supply a machine file?
Only if you want to use specific nodes.

Q4: How do I specify which nodes to use on flyer? Do I need a machine file also?
On flyer you don't need a machine file.
You specify the layout of MuPC threads using the command line, for example:
1. 2 UPC threads using 2 nodes: prun -I -n 2 -N 2 ./a.out
2. 2 UPC threads using 1 node: prun -I -n 2 -N 1 ./a.out
3. 4 UPC threads using 4 nodes: prun -I -n 4 -N 4 ./a.out
4. 4 UPC threads using 2 nodes: prun -I -n 4 -N 2 ./a.out
5. 4 UPC threads using 1 node: (poor performance, not recommended) prun -I -n 4 -N 1 ./a.out
6. 16 UPC threads using 8 nodes: prun -I -n 16 -N 8 ./a.out
and so on.
For more information, please refer to the man pages for prun and allocate.
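
The pattern in the list above can be sketched as a small shell function that builds the prun command from a thread count and a node count. flyer_prun is a hypothetical helper name. The warning threshold of 2 threads per node is the recommendation from this FAQ, not a limit that prun itself enforces.

```shell
# Build a prun command line for flyer from a thread count and a node count.
# flyer_prun is a hypothetical helper, not an installed command.
flyer_prun() {
    threads=$1
    nodes=$2
    # Threads per node, rounded up.
    per_node=$(( (threads + nodes - 1) / nodes ))
    if [ "$per_node" -gt 2 ]; then
        echo "warning: $per_node threads per node, expect poor performance" >&2
    fi
    echo "prun -I -n $threads -N $nodes ./a.out"
}

flyer_prun 16 8    # 2 threads per node
flyer_prun 4 1     # 4 threads per node, also prints a warning
```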

Q5: How do I specify which nodes to use on flash?
You have no control over processor allocation on flash. The system automatically spawns UPC threads for you on the least busy nodes.

Q6: How do I set the cache size for MuPC?
Create a file named mupc.conf in your home directory. Add the following three lines to this file:
CACHE_LINE_LENGTH 1024
CACHE_TABLE_SIZE 256
SHARED_MEM_SIZE_PER_THREAD 268435456

The first two lines govern the geometry of the cache. The values shown above are the default settings; modify them to change the cache size. The values must always be powers of 2, with one exception: setting both to zero turns off the cache. The maximum value for CACHE_TABLE_SIZE is 1024, and the maximum value for CACHE_LINE_LENGTH is 8192. Read more in Q7.
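
Before editing mupc.conf, it can help to check a candidate pair of values against the rules above. The following is a sketch only: check_mupc_cache and is_pow2 are hypothetical helpers, not part of MuPC, and they encode nothing beyond the power-of-2 rule, the two maximums, and the zero exception stated above.

```shell
# Succeeds if $1 is a positive power of 2.
is_pow2() {
    [ "$1" -gt 0 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

# Usage: check_mupc_cache CACHE_LINE_LENGTH CACHE_TABLE_SIZE
check_mupc_cache() {
    line=$1
    table=$2
    # Both values zero: the cache is turned off.
    if [ "$line" -eq 0 ] && [ "$table" -eq 0 ]; then
        echo "cache off"
        return 0
    fi
    if ! is_pow2 "$line" || [ "$line" -gt 8192 ]; then
        echo "invalid CACHE_LINE_LENGTH: $line"
        return 1
    fi
    if ! is_pow2 "$table" || [ "$table" -gt 1024 ]; then
        echo "invalid CACHE_TABLE_SIZE: $table"
        return 1
    fi
    echo "ok"
}

check_mupc_cache 1024 256            # the default settings
check_mupc_cache 0 0                 # cache turned off
check_mupc_cache 1000 256 || true    # 1000 is not a power of 2
```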

Q7: What exactly is the structure of MuPC's cache?
The cache in MuPC is a non-coherent, direct-mapped, write-back cache. Each UPC thread maintains a cache with (THREADS-1) blocks, one for references made to each of the other threads. Each block has CACHE_TABLE_SIZE lines; each line has CACHE_LINE_LENGTH bytes.
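
Taking that structure at face value, the total cache held by each UPC thread is (THREADS-1) x CACHE_TABLE_SIZE x CACHE_LINE_LENGTH bytes. The sketch below does the arithmetic; mupc_cache_bytes is a hypothetical helper, and the thread count of 16 is just an example.

```shell
# Total MuPC cache per UPC thread, assuming the structure described above:
# (THREADS-1) blocks x CACHE_TABLE_SIZE lines x CACHE_LINE_LENGTH bytes.
mupc_cache_bytes() {
    threads=$1
    table_size=$2
    line_length=$3
    echo $(( (threads - 1) * table_size * line_length ))
}

# Default geometry (256 lines of 1024 bytes) with 16 threads.
mupc_cache_bytes 16 256 1024    # prints 3932160
```

With the default settings each block is 256 KB, so 16 threads give 15 blocks, or 3.75 MB of cache per thread.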

Q8: What is the third line (last line) in mupc.conf for?
This line specifies the amount of heap space available for UPC's dynamic memory allocation. You can enlarge this value if you encounter an "insufficient memory" problem. But remember that the real amount of heap space is limited by the physical memory resources on the platform.

Q9: How do I set the cache size for HP's UPC?
The following environment variables control the cache behavior for HP's UPC:
UPCRTS_USE_CACHE (default FALSE, set to TRUE to turn on caching)
UPCRTS_CACHE_SETS (default 128, similar to CACHE_TABLE_SIZE in MuPC)
UPCRTS_CACHE_BLOCK_SIZE (default 64 bytes, similar to CACHE_LINE_LENGTH in MuPC)
UPCRTS_CACHE_ASSOCIATIVITY (default 4)
Please refer to the man page of HP's UPC for more details.
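
For example, in a bash or sh session the variables could be set as follows before launching the program. The variable names come from the list above; the values are only an illustration. The capacity calculation assumes the usual set-associative layout (sets x block size x associativity), which is an assumption here; the man page is the authority.

```shell
# Turn on caching for HP's UPC and double the number of sets.
# The values are examples, not recommendations.
export UPCRTS_USE_CACHE=TRUE
export UPCRTS_CACHE_SETS=256
export UPCRTS_CACHE_BLOCK_SIZE=64
export UPCRTS_CACHE_ASSOCIATIVITY=4

# Assumed capacity: sets x block size x associativity (in bytes).
cache_bytes=$(( UPCRTS_CACHE_SETS * UPCRTS_CACHE_BLOCK_SIZE * UPCRTS_CACHE_ASSOCIATIVITY ))
echo "assumed cache capacity: $cache_bytes bytes"

# Then run as usual, for example: prun -I -n 8 ./a.out
```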

Q10: Do I need to re-compile my code every time I change the cache size?
No. These settings take effect at run time only.

Q11: How do I set the cache sizes for Berkeley's UPC and GCC-UPC?
As far as we know, Berkeley's UPC and GCC-UPC do not have caching facilities yet.

Q12: How do I turn caching on and off?
See Q6 or Q9.

Q13: How do I set the heap size for dynamic shared memory allocation in MuPC?
See Q8.

Q14: How do I set the heap size for dynamic shared memory allocation in HP's UPC?
Modify the LIBELAN_ALLOC_SIZE environment variable.

Q15: How do I set the heap size for dynamic shared memory allocation in Berkeley's UPC?
Use the "-shared-heap" switch at compile time. For example:
upcc -T 4 -shared-heap=256MB prog.c
upcc -T 4 -shared-heap=1GB prog.c
and so on.

Q16: How do I set the heap size for dynamic shared memory allocation in GCC-UPC?
We don't know yet. If you figure it out, please let us know.

Questions about using lionel

Q1: When running MuPC, how do I specify which nodes to use on lionel?
See Q1 in the "Run time system configuration" section.

Q2: I am used to using LAM MPI. Why can I not use LAM anymore?
MuPC on lionel is configured to use MPICH-GM. Your PATH environment variable should contain an entry for /usr/local/mpi/bin. This way MuPC can find the correct mpicc and mpirun. See Q3 for more information.

Q3: How do I edit my PATH environment variable to get the right MPI for MuPC on lionel?
Edit the .bashrc file in your home directory (assuming you are using bash).
Add /usr/local/mpi/bin and /usr/local/MuPC/bin to the front of PATH where it is set in that file.
Then log out and log in again.
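
A minimal sketch of the lines to add to .bashrc, assuming bash as above. The two directories are the ones named in this answer; the checks at the end are optional.

```shell
# Put the MPICH-GM and MuPC directories at the front of PATH.
export PATH=/usr/local/mpi/bin:/usr/local/MuPC/bin:$PATH

# After logging in again, check which mpirun and mupcc are found first.
command -v mpirun || echo "mpirun not found on this machine"
command -v mupcc || echo "mupcc not found on this machine"
```

On lionel the two checks should report paths under /usr/local/mpi/bin and /usr/local/MuPC/bin respectively.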

Q4: Why does MuPC hang on lionel?
The most likely cause is a bug in your program. If you are certain your program is correct, there could be something wrong with the system.
See Q1 in the "Run time system configuration" section.

Q5: Why do I get "cannot open GM port" error messages, or something similar?
It might be a problem with the system. Please see Q7 below.
Sometimes the Myrinet network on lionel misbehaves. If that happens, send a note to Christopher K. Pinnow (ckpinnow@mtu.edu).

Q7: I do have a machine file specified in the command line, but MuPC still fails. Why?
Check your machine file.
On lionel, only nodes n1 to n16 are connected by the Myrinet switch. Nodes n17, n18, n19 and n20 are not on the Myrinet network (nor are they on the Gigabit Ethernet network). None of them should be listed in the machine file. You should also not include the front end node, lionel.cse.mtu.edu, in the machine file. It is not on the Myrinet network either and it is not a compute node.

Questions about using flyer

Q1: What MPI implementation are we using on flyer? Do I need to edit the PATH environment variable in order to run MuPC?
We use the Quadrics MPI library on flyer. This is the default MPI installation. You do not need to edit your PATH to run MuPC on flyer.

Q2: How do I specify which nodes to use on flyer? Do I need a machine file also?
See Q4 in the "Run time system configuration" section.

Q3: How do I set the cache size for HP's UPC on flyer?
See Q9 in the "Run time system configuration" section.

Q4: Why do I get an "insufficient memory" message when using HP's UPC?
Q5: Why does dynamic allocation fail when using HP's UPC?
For both questions, see Q14 in the "Run time system configuration" section.

Q6: Why do I get "cannot allocate resource now" error messages on flyer?
This means that not enough nodes are currently available because some are occupied by other users. Wait and try again later. You can use the command rinfo to see which nodes are currently free.

Q7: Why can't the block size of a shared object exceed 1024?
This is a limitation of some old versions of HP's UPC. In the current version, the -wide option in the command line allows you to have a much larger block size. Read the man page for more information.

Questions about using flash

Q1: How do I specify which nodes to use on flash?
See Q5 in the "Run time system configuration" section.

Q2: Why doesn't "logout" work on flash?
Because the default shell on flash is ksh. Use exit instead, or change your default shell to csh using the chsh command.

Q3: How do I set the cache size for GCC-UPC on flash?
GCC-UPC doesn't have a cache yet.

Q4: How do I set the heap size for dynamic shared memory allocation in GCC-UPC?
See Q16 in the "Run time system configuration" section.


© 2004 Michigan Technological University
Last modified 7/9/4