UA HPC (High Performance Computing)


The University of Arizona provides high-performance computers for processing large amounts of data. You can get access if you have a UA-related need to use them. Experience with the Unix shell and shell scripting will be very useful.




Terminology

shell / bash / terminal / command prompt - These terms are roughly interchangeable for our purposes. This is the text interface to Unix systems that you access using SSH. You enter commands, and it performs an action such as running a script or opening a program. A 'bash' or 'shell' script is simply a list of these commands.

home directory - This is your user's personal directory. It is represented by '~' (tilde).



Basic Unix Tutorial

If you are unfamiliar with Unix/Linux-type operating systems, here are some tutorials that cover basics such as navigating and listing files and directories (tutorial one), moving/copying/removing files (tutorial two), and file permissions (tutorial five).

For a guide on how to compile and install software packages on Unix (using configure/make/make install, etc.), see tutorial seven.

Unix Tutorials

Creating Your HPC Account

Before you can begin using the HPC systems, you will need to create an account. Creating your account requires a sponsor, who must be a UA professor. The sponsor registers your NetID with HPC, which allows you to sign up for an account.

The HPC account sign-up page is located at https://account.arizona.edu. The page includes further instructions for getting your account set up.



Using Your HPC Account

Logging In

You can access the HTC systems through SSH. This includes both command line usage and programs that use the X11 windowing system. Your username is your NetID, and your password is your NetID password.

OS X

On OS X, SSH is available through the Terminal application, located in /Applications/Utilities. Use the command "ssh [NETID]@login.hpc.arizona.edu" to log on, replacing [NETID] with your NetID. Enter your password when prompted.

Windows

You will need to download an SSH client. PuTTY and TeraTerm Pro are a couple of free SSH programs for Windows.

PuTTY

http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html

TeraTerm Pro

http://en.sourceforge.jp/projects/ttssh2/releases/

Linux

Use the command "ssh [NETID]@login.hpc.arizona.edu" at the terminal, replacing [NETID] with your NetID, and enter your password when prompted.
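As a rough sketch for OS X and Linux, the login command can be wrapped in a small shell function. The function name hpc_login is made up for illustration. The -X option is a standard OpenSSH flag that requests X11 forwarding; whether you need it, and whether the login node permits it, is an assumption here.

```shell
# Hypothetical helper: log in to the HPC login node as the given NetID.
# Usage: hpc_login NETID
hpc_login() {
    local netid="$1"
    if [ -z "$netid" ]; then
        echo "usage: hpc_login NETID" >&2
        return 1
    fi
    # -X asks OpenSSH to forward X11 so graphical programs can display locally
    # (assumption: X11 forwarding is permitted on the login node).
    ssh -X "${netid}@login.hpc.arizona.edu"
}
```

You could paste this function into your local ~/.bashrc and then log in with hpc_login followed by your NetID.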



How To Download a File To HPC

It is useful to be familiar with the process of downloading a file from the internet directly to HTC.

  1. It is recommended to create a "downloads" folder in your home directory. This will keep downloaded files from cluttering your home directory. Enter mkdir ~/downloads at the prompt to create it. Then, or if you have already created it, enter cd ~/downloads to change to that directory.
  2. Find the URL of the file you wish to download and copy it. If you are browsing the internet and you see a link to the file, right click on the link and choose "Copy Link Address" or similar.
  3. Type "wget", then a space, then paste the address you have copied.
  4. Press enter.

The file will begin downloading to the current directory.
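The steps above can be sketched as one small shell function. The function name fetch_to_downloads and the example URL are made up for illustration. Instead of changing directories first, this version uses wget's -P option to choose where the file is saved.

```shell
# Hypothetical helper: download a URL into a directory (default: ~/downloads).
# Usage: fetch_to_downloads URL [DIRECTORY]
fetch_to_downloads() {
    local url="$1"
    local dest="${2:-$HOME/downloads}"
    # -p creates the directory only if it does not exist yet
    mkdir -p "$dest" || return 1
    # -P (--directory-prefix) tells wget which directory to save the file in
    wget -P "$dest" "$url"
}

# Example call; the URL is a placeholder, not a real file:
# fetch_to_downloads "https://example.org/dataset.tar.gz"
```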



Installing Programs To HPC

Sometimes it is also helpful to be able to install software into your own user directory. The instructions are likely to vary depending on the software, but here are some tips.

You don't need administrator (root) access to install most software, as long as you install it into your home directory.

I recommend creating a directory "~/opt" to store your software. Enter mkdir ~/opt to do that.

If you are installing software using configure, make, and make install, enter ./configure --prefix=$HOME/opt so that make install places the software in that directory.

Here is a tutorial for using the configure, make, and make install commands: http://www.ee.surrey.ac.uk/Teaching/Unix/unix7.html.

Here is a link to another tutorial explaining how to use configure / make / make install.
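Here is a minimal sketch of the whole configure, make, make install sequence, assuming the software ships a standard autoconf-style configure script that accepts the --prefix option. The function name install_to_prefix and the directory name in the example are made up. Individual packages may need extra options, so check their own install notes.

```shell
# Hypothetical helper: build a source tree and install it under a prefix.
# Usage: install_to_prefix SOURCE_DIR [PREFIX]
install_to_prefix() {
    local src_dir="$1"
    local prefix="${2:-$HOME/opt}"
    mkdir -p "$prefix" || return 1
    # Run in a subshell so the caller's working directory is unchanged.
    (
        cd "$src_dir" || exit 1
        # --prefix sets the root that "make install" copies files into,
        # so executables typically end up in $prefix/bin.
        ./configure --prefix="$prefix" && make && make install
    )
}

# Example call; "mytool-1.0" is a made-up directory name:
# install_to_prefix ~/downloads/mytool-1.0
```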

When you type a command at the prompt, the shell searches the directories listed in your PATH environment variable for an executable file with that name. Enter echo $PATH into the terminal to print the current value, which is a colon-separated list of directories. Appending a directory to PATH allows the shell to run executable files in that directory without you specifying the absolute path to the executable file. We accomplish this by adding a line to our ~/.bashrc configuration file. Enter echo 'export PATH=$PATH:$HOME/opt/bin' >> ~/.bashrc to add the necessary line to your configuration file. The single quotes prevent the shell from expanding $PATH and $HOME before the line is written to the file.

export PATH=$PATH:$HOME/opt/bin is the line that needs to be inserted into the configuration file. This exports the modified PATH environment variable each time you log in.

echo prints the line, and >> appends the output of echo to the file ~/.bashrc.
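Running the echo command more than once adds the same line to ~/.bashrc each time. As a small sketch, a helper like the one below appends the line only if it is not already present. The function name append_once is made up for illustration.

```shell
# Hypothetical helper: append a line to a file unless it is already there.
# Usage: append_once LINE FILE
append_once() {
    local line="$1"
    local file="$2"
    # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Single quotes keep $PATH and $HOME unexpanded, so they are evaluated
# each time ~/.bashrc is read rather than once when the line is written:
# append_once 'export PATH=$PATH:$HOME/opt/bin' ~/.bashrc
```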



PBS Job Scripting

The HPC computing systems use software called PBS (Portable Batch System) to schedule the available resources. You can view the UITS documentation for PBS here: http://rc.arizona.edu/hpc-htc/using-systems/batch-system



PBS Scripts

A PBS script is a shell script that contains PBS directives. A single PBS directive looks like this: #PBS -N asmbl.

This directive sets the name of the PBS job to "asmbl".

Commonly used PBS directives:

-N [NAME] - sets the name of your job in the PBS system

-W group_list=[GROUPNAME] - sets the group you belong to. Use va on the terminal to determine which group you belong to.

-M [EMAILADDRESS] - sets the email address that PBS sends job notifications to. Add -m e to receive an email with the exit status of the job when it finishes

-q [QUEUENAME] - sets the queue that your job will be in, e.g., standard or windfall

-l select=1:[SELECTSTATEMENT] - allocates CPUs and memory for your job, e.g., ncpus=12:mem=23gb

-l cput=[CPUTIME] - allocates CPU time for your job, e.g., 50:0:0

-l walltime=[WALLTIME] - allocates wallclock time for your job, e.g., 7:0:0

Take a look at the University of Arizona PBS documentation: http://rc2.webhost.uits.arizona.edu/hpc-htc/using-systems/pbs-example.
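Putting the directives together, a complete PBS script might look like the sketch below. The resource values are simply the examples from the list of directives. The group name and email address are placeholders, and the -m e line is an assumption about how you want to be notified. The UITS documentation has the limits that actually apply to each queue.

```shell
#!/bin/bash
# Job name shown in the queue
#PBS -N asmbl
# Group for the job; "mygroup" is a placeholder (run va to find yours)
#PBS -W group_list=mygroup
# Notification address (placeholder)
#PBS -M netid@example.edu
# Assumption: -m e asks PBS to send mail when the job ends
#PBS -m e
#PBS -q standard
# One chunk with 12 CPUs and 23 GB of memory
#PBS -l select=1:ncpus=12:mem=23gb
# CPU time and wallclock time limits (hours:minutes:seconds)
#PBS -l cput=50:0:0
#PBS -l walltime=7:0:0

# PBS sets PBS_O_WORKDIR to the directory qsub was run from;
# fall back to the current directory when run outside PBS.
work_dir="${PBS_O_WORKDIR:-$PWD}"
cd "$work_dir" || exit 1

# Replace this line with the commands your job should run.
echo "Job running in $work_dir"
```

Because the directives are ordinary shell comments, the script can also be run directly with bash to check the commands before it is submitted.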



Commands

Here are some useful PBS commands to use at the terminal:

qsub

This submits a PBS script to the server. It will be placed in a queue and usually starts running within a few minutes.

qstat

Prints the PBS job queues.

qstat | grep [USERNAME]

Prints any jobs submitted by [USERNAME]

qdel -W force [JOB_ID]

This removes a job from the queue, stopping it if necessary. Use qstat to find the JOB_ID (e.g., 18426.service0).
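As a sketch of how these commands fit together, the helper functions below submit a script, keep the job ID that qsub prints, and filter the queue listing by user. The function names and the script file name are made up for illustration.

```shell
# Hypothetical helpers built on the commands above.

# Submit a PBS script and print the job ID that qsub reports.
# Usage: submit_job SCRIPT
submit_job() {
    local script="$1"
    local job_id
    # qsub writes the new job's ID (e.g., 18426.service0) to standard output
    job_id=$(qsub "$script") || return 1
    echo "$job_id"
}

# List queue entries that mention a user (default: the current user).
# Usage: my_jobs [USERNAME]
my_jobs() {
    qstat | grep -- "${1:-$USER}"
}

# Example session; "asmbl.pbs" is a made-up file name:
# job_id=$(submit_job asmbl.pbs)
# my_jobs
# qdel -W force "$job_id"
```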



Module System

The HPC systems use a "module" system to dynamically load an environment for a specific piece of software.

Here are some useful commands:

module help

Gives you a listing of all the commands and some instructions. Run "man module" for much more information.

module avail

This lists all of the modules that are available

module show [MODULENAME]

Displays information about what a certain module is and does. Gives you some URLs for the software, and also tells you where the program executable is located in the filesystem. This location will be dynamically prepended to your path when you use "module load".

module load [MODULENAME]

Loads the specified module. Prepends the directories listed in the module's "prepend-path" entries to your $PATH environment variable.

For more information, read the official documentation here: http://www.u.arizona.edu/~ferng/HPC/sgi_ice_user_guide.html#Modules.
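A typical sequence is to inspect a module and then load it. The sketch below wraps that in a function. The name load_module and the module name in the example are made up, so use module avail to see what is actually installed.

```shell
# Hypothetical helper: show a module's details, then load it.
# Usage: load_module MODULENAME
load_module() {
    local name="$1"
    # "module" is normally a shell function set up at login on HPC systems;
    # stop early on machines that do not have it.
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not available in this shell" >&2
        return 1
    fi
    module show "$name" && module load "$name"
}

# Example call; "gcc" is only an illustrative name:
# load_module gcc
```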



Xdisk

Xdisk is used on the HTC systems to allocate storage space for your use. Without xdisk, your quota is only 5 GB, so you must use xdisk to store larger amounts of data.

Useful commands:

Enter quota at the command line to view your storage space usage information.

Enter xdisk -c query to view information about your current xdisk allocation, if you have one.

Enter xdisk -c expire [TIME] to extend the life of your xdisk allocation.

For more information, read the official documentation here: http://www.u.arizona.edu/~ferng/HPC/sgi_ice_user_guide.html#Using_Xdisk_and_scratch_Storage.