Latest revision as of 14:30, 26 September 2012
Multiple Processor Runs on the OBCP Cluster
This section of the wiki is designed to help a user run programs using multiple processors through a Message Passing Interface (MPI) on the OBCP cluster. All actions on the cluster require a user account. Send any requests for a new user account to Joe Cora.
If you are having trouble connecting to the OBCP cluster, visit the Accessing the OBCP Cluster page.
Multiple Processors with MPI
Background
Message Passing Interface (MPI) is a cross-platform interface standard for running multiple parallel processes on a number of remote or local multiprocessor machines. Since MPI is a standard and not a program, there are many different implementations of it. On the OBCP cluster, OpenMPI provides the MPI functionality and requires SSH to spawn additional processes.
Setting Up SSH for MPI
For background information or additional troubleshooting help with OpenMPI and SSH, see the page on setting up SSH keys on the OpenMPI website.
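- Begin by generating an authentication key for SSH connections by typing the command below. When prompted, provide a passphrase for the key. This is NOT the same as your login password, and it may be left blank. Note that newer OpenSSH releases disable DSA keys by default; where that applies, generate an RSA key with -t rsa instead and use id_rsa in place of id_dsa in the steps below.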
- ssh-keygen -t dsa
- Start the SSH agent by typing the command below, and make sure that it returns a value (e.g. Agent pid 25099). Use the grave accent ` and not a single-quote character around the evaluated command.
- eval `ssh-agent`
- Add the generated private key to the running SSH agent. If you set a passphrase, you will be asked for it here.
- ssh-add $HOME/.ssh/id_dsa
- Add the newly generated public key to the SSH authorized keys file. This allows SSH connections to be started on your behalf without entering your login information every time.
- cat $HOME/.ssh/id_dsa.pub >> $HOME/.ssh/authorized_keys
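Running the last step more than once appends the same key again each time. The sketch below shows one way to make that step repeatable. The function name authorize_key is hypothetical and not part of OpenSSH or OpenMPI, and the permission change assumes the usual SSH server behaviour of ignoring an authorized keys file that other users can write to.

```shell
# Hypothetical helper: append a public key to an authorized_keys file
# only if that exact line is not already present.
authorize_key() {
    pub_file=$1
    auth_file=$2
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -F: fixed strings, -x: whole-line match, -q: quiet, -f: read patterns from file
    if ! grep -qxF -f "$pub_file" "$auth_file"; then
        cat "$pub_file" >> "$auth_file"
    fi
    # Assumption: the SSH server is strict about permissions on this file
    chmod 600 "$auth_file"
}

# Usage with the file names from the steps above:
# authorize_key "$HOME/.ssh/id_dsa.pub" "$HOME/.ssh/authorized_keys"
```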
Starting a Program with MPI
Running a program with MPI is easy. First decide how many processors, or pseudo-processors in the case of the OBCP cluster, are needed, then start the MPI-enabled program with the mpirun command. Use mpirun in the following format: mpirun -np [# of processors] [program name] > [output file]. Here is an example: mpirun -np 8 mb test.nex > out.txt, where 8 is the number of processors to use, mb is the MPI-enabled MrBayes program, test.nex is the MrBayes input file, and out.txt is the file that receives the program's output stream. To capture the error stream as well, append 2> [error file]. For example, ... >out.txt 2>err.txt.
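The command format above can be wrapped in a small shell function so that a mistyped processor count is caught before anything starts. This is only a sketch: run_mpi and the MPIRUN override variable are hypothetical names introduced here, not part of OpenMPI.

```shell
# Hypothetical wrapper for:
#   mpirun -np [# of processors] [program name] > [output file] 2> [error file]
# MPIRUN is an assumed override variable; it defaults to mpirun.
run_mpi() {
    np=$1; out=$2; err=$3
    # Reject anything that is not a positive whole number
    case $np in
        ''|*[!0-9]*|0)
            echo "run_mpi: bad processor count: $np" >&2
            return 2 ;;
    esac
    shift 3
    # Remaining arguments are the program name and its own arguments
    "${MPIRUN:-mpirun}" -np "$np" "$@" > "$out" 2> "$err"
}

# Usage, matching the MrBayes example above:
# run_mpi 8 out.txt err.txt mb test.nex
```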
After starting a program using MPI, you may want to make sure that the program is running properly and that all specified processors are in use. In the terminal, type the command top -u [username], where username is your username. This shows the processes that you are running, which should include one entry with the program name for each processor specified with mpirun. If the program name is not shown, there was a problem starting the program using MPI. Open the error stream file for additional information on the problem.
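The same check can be scripted instead of read off the top display. The sketch below compares a process count against the number given to mpirun -np. The function name check_proc_count is hypothetical, and the usage line assumes that the pgrep utility is installed on the cluster.

```shell
# Hypothetical helper: compare the number of running copies of a program
# with the number requested via mpirun -np.
check_proc_count() {
    expected=$1
    found=$2
    if [ "$found" -eq "$expected" ]; then
        echo "OK: $found of $expected processes running"
    else
        echo "WARNING: $found of $expected processes running; check the error stream file"
        return 1
    fi
}

# Usage (assumes pgrep is available; mb is the example program name):
# check_proc_count 8 "$(pgrep -u "$USER" -x mb | wc -l)"
```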
Additionally, place an ampersand & at the end of the command and its options to run the program in the background, and put the nohup command in front of mpirun so that the parallel program keeps running after you log out. This is useful for long analyses where staying logged into the OBCP cluster for the entire duration of the run is impractical. For example, nohup mpirun -np 8 mb test.nex >out.txt 2>err.txt &.
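For a run that outlives the login session, it can help to record the process ID so that the job can be found again later. The following is a minimal sketch under that assumption: start_detached, the MPIRUN override variable, and the .pid file name are all hypothetical choices, not an OpenMPI feature.

```shell
# Hypothetical helper: start an MPI run with nohup in the background
# and save its process ID next to the output file.
# MPIRUN is an assumed override variable; it defaults to mpirun.
start_detached() {
    np=$1; out=$2; err=$3
    shift 3
    nohup "${MPIRUN:-mpirun}" -np "$np" "$@" > "$out" 2> "$err" &
    # $! holds the process ID of the background job just started
    echo $! > "$out.pid"
}

# Usage, matching the example above:
# start_detached 8 out.txt err.txt mb test.nex
# Later, after logging back in:
# top -p "$(cat out.txt.pid)"
```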