Difference between revisions of "Multiple Processor Runs on the OBCP Cluster"

From xBio:D Wiki

Revision as of 14:50, 29 August 2012

Multiple Processor Runs on the OBCP Cluster

This section of the wiki is designed to help a user run programs using multiple processors through a Message Passing Interface (MPI) on the OBCP cluster. All actions on the cluster require a user account. Send any requests for a new user account to Joe Cora.

If you are having trouble connecting to the OBCP cluster, visit the Accessing the OBCP Cluster page.

Multiple Processors with MPI

Background

Message Passing Interface (MPI) is a cross-platform interface standard for spawning multiple parallel jobs on a number of remote or local multiprocessor machines. Because MPI is a standard and not a program, there are many different implementations of it. On the OBCP cluster, OpenMPI provides the MPI functionality and requires SSH to spawn additional processes.


Setting Up SSH for MPI

For background information or additional OpenMPI troubleshooting, visit the page on setting up SSH keys on the OpenMPI website.


  1. Begin by generating an authentication key for SSH connections by typing the command below. When prompted, provide a password for the key. This is NOT the same as your login password, and it may be left blank.
    ssh-keygen -t dsa
  2. Start the SSH agent by typing the command below, and make sure that it returns a value (e.g. Agent pid 25099). Enclose ssh-agent in grave accents ` and not single-quote characters so that the output of the command is evaluated.
    eval `ssh-agent`
  3. Add the generated authentication key to the SSH agent.
    ssh-add $HOME/.ssh/id_dsa
  4. Add the newly generated SSH authentication information to the SSH authorized keys file. This will allow SSH to be started by the user without the need to enter their login information every time.
    cat $HOME/.ssh/id_dsa.pub >> $HOME/.ssh/authorized_keys
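
After step 4, it can be useful to confirm that the public key really ended up in the authorized keys file. The following is a minimal sketch, not part of the original instructions; the function name key_is_authorized is hypothetical, and it assumes the public key file holds a single line.

```shell
# Hypothetical helper (not an OpenSSH command): succeeds if the public key
# stored in the first file appears as a whole line in the second file.
key_is_authorized() {
    pub_key_file="$1"
    auth_file="$2"

    # Return 2 if either file is missing or unreadable.
    if [ ! -r "$pub_key_file" ] || [ ! -r "$auth_file" ]; then
        return 2
    fi

    # -q: quiet, -x: match the whole line, -F: treat the key as a fixed string.
    grep -qxF -- "$(cat "$pub_key_file")" "$auth_file"
}
```

For the files used in the steps above, the call would be key_is_authorized $HOME/.ssh/id_dsa.pub $HOME/.ssh/authorized_keys, which returns 0 when the key is listed and 1 when it is not.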


Starting a Program with MPI

Running a program with MPI is easy. First decide how many processors, or pseudo-processors in the case of the OBCP cluster, are needed, then start the MPI-enabled program with the mpirun command. Use mpirun in the following format: mpirun -np [# of processors] [program name] > [output file]. Here is an example: mpirun -np 8 mb test.nex > out.txt, where 8 is the number of processors spawned, mb is the MPI-enabled MrBayes program, test.nex is the MrBayes input file, and out.txt is the file to which the program's output stream is redirected. To capture the error stream as well, append 2> [error file]. For example, ... >out.txt 2>err.txt.
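
The mpirun format above, with both the output and error streams captured, can be wrapped in a small shell function. This is only a sketch: the name run_mpi and the MPIRUN override variable are hypothetical conveniences (the override allows a dry run with a stand-in command), not part of OpenMPI.

```shell
# Hypothetical wrapper around: mpirun -np N program args > out 2> err
# Usage: run_mpi <processors> <output file> <error file> <program> [args...]
run_mpi() {
    np="$1"
    out_file="$2"
    err_file="$3"
    shift 3

    # Reject an empty or non-numeric processor count before starting anything.
    case "$np" in
        ''|*[!0-9]*)
            echo "processor count must be a whole number" >&2
            return 1
            ;;
    esac

    # MPIRUN is an assumed override for testing; it defaults to the real mpirun.
    "${MPIRUN:-mpirun}" -np "$np" "$@" > "$out_file" 2> "$err_file"
}
```

With this in place, the MrBayes example would be started as run_mpi 8 out.txt err.txt mb test.nex.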

After starting a program using MPI, you may want to make sure that the program started properly and that all specified processors are in use. In the terminal, type the command top -u [username], where username is your username. This lists the processes that you are running, which should include one entry with the program name for each processor specified with mpirun. If the program name is not shown, there was a problem starting the program with MPI.
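
The same check can be scripted instead of read off the top display. The sketch below assumes a list of command names, one per line, is supplied on standard input; the function name check_proc_count is hypothetical.

```shell
# Hypothetical helper: reads command names (one per line) from stdin and
# checks that exactly <expected> of them are equal to <program>.
# Usage: check_proc_count <program> <expected>
check_proc_count() {
    prog="$1"
    expected="$2"

    # grep -c prints the count even when it is 0, but then exits non-zero,
    # so "|| true" keeps the assignment from failing.
    found=$(grep -cxF -- "$prog" || true)

    echo "$found of $expected $prog processes running"
    [ "$found" -eq "$expected" ]
}
```

One way to feed it, assuming a ps that supports the -u and -o options, is ps -u [username] -o comm= | check_proc_count mb 8, which succeeds only when eight mb processes are found.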

Additionally, place the nohup command in front of mpirun to decouple the program running in parallel from the user's session. This is useful for long analyses where staying logged into the OBCP cluster for the entire duration is untenable.